Preface


These proceedings contain the papers presented at the Eighth Annual IFIP WG 11.3 Working
Conference on Database Security held in Bad Salzdetfurth near Hildesheim, Germany, August
23-26, 1994. The conference was sponsored jointly by the International Federation for
Information Processing (IFIP) Working Group 11.3 on Database Security and the University of
Hildesheim. Furthermore, the U.S. Office of Naval Research supported the travel of many U.S.
participants. As with its predecessors, the purpose of the conference was to present original work
in database security research and practice, to enable participants to benefit from personal
scientific discussions and to expand their knowledge, to support the activities of the Working
Group, and to disseminate its results.

All  submitted  papers  were  reviewed  by  working  group  members  and  a  small  number  of  external 
reviewers.  Based  on  these  evaluations  18  papers  were  finally  accepted  for  presentation.  In 
addition  the  program  included  a  session  on  status  reports  on  current  projects  and  a  panel 
discussion  on  perspectives  on  database  security.  These  sessions  are  also  documented  in  these 
proceedings  by  short  abstracts. 

We highly appreciated two invited lectures that are also included in these proceedings. Ab
Bakker from BAZIS, Leiden, Netherlands, presents his deep insight into security in health care
systems, which allowed us to continue the discussion of security in this important field of
application. Klaus Dittrich from the University of Zurich, Switzerland, challenges the security
community with current trends in database technology to which he himself has substantially
contributed.

A Working Conference is based on the joint efforts of many people. We thank all of them
sincerely: the authors of submitted papers, the reviewers, and the participants, in particular the
invited speakers. We are also particularly grateful to Jimmy Brüggemann and Christian Eckert for
local organization.


SPECIAL CARE NEEDED FOR THE HEART OF MEDICAL INFORMATION
SYSTEMS

ALBERT R. BAKKER
BAZIS  FOUNDATION 

SCHIPHOLWEG  97  2316XA  LEIDEN,  THE  NETHERLANDS 

abstract 

First, characteristics of medical information systems are
described. Next, the role of the database is considered in more
detail, and the resulting requirements for security of medical
databases are considered.

Special attention is given to the requirements resulting
from use of the database for medical audit. It is concluded
that in databases holding patient care data strict time-stamping
is required. Even after expiration of the specified
storage period there has to be a certain period during which
the data are dormant; in that state they are accessible by the
auditor only.

1. INTRODUCTION

Data  play  an  important  role  in  modern  health  care.  Data  are 
recorded  for  several  purposes: 

- supporting the memory of the health care professional,
to allow him to do his diagnostic and therapeutic work,

- as a communication vehicle between members of the team that
is caring for the patient,

- as a communication vehicle with medical support departments,
like laboratories, radiology, pharmacy, etc.,

- as input to the financial administration, the reimbursement,
the budgeting process and the management of the care facility,

- as potential input for (epidemiological) research and education.

With the growth of medical knowledge and medical technology
the amount and diversity of data to be recorded is increasing
rapidly. Because the need for medical specialisation is also
increasing, the need for communication with other specialists
(including those within specialised support departments) is
increasing. This has led to a situation where it is estimated that
handling of information in a broad sense (including images)
generates about 30% of the costs of a hospital.

For the provision of health care we find a wide range of
institutions and facilities (hospital, general practitioner,
pharmacy, community care, physiotherapist, etc.). Until now the
amount of data exchanged between these different health care
providing organisations/persons has been limited; the exchange of
information is now primarily within the walls of the institution.
Both the emerging technical facilities for communication
(WANs) and the need to improve the efficiency in health care
will lead to more emphasis on external communication.

Medical data about patients are often of a rather sensitive
nature and deserve protection; the "medical secret" is a
worldwide principle and is mentioned explicitly in the professional
oath (Hippocratic oath) [1]. As long as medical care was
supplied in a direct one-to-one patient-doctor relation the
application of the medical secret was almost unambiguous.
However, in modern health care many persons are involved, such as
other members of the care team (doctors as well as nurses),
secretaries, laboratory technicians, medical records clerks,
and administrators. In addition, insurance companies as well as
researchers ask for data. The practical rules for
access to patient data differ between countries and between
institutions, although the general principles as described
in the Council of Europe's Regulations for Automated Medical
Databanks R(81) [2] are rather well accepted and are the basis
for most data protection laws.

2.  CHARACTERISTICS  OF  MEDICAL  INFORMATION  SYSTEMS 

Where security is concerned we generally find special
attention for medical information systems. Examples are
the recommendation of the Council of Europe and
special clauses in many national laws and regulations referring
to medical information systems. Furthermore we find
dedicated committees (e.g. CEN TC251 wg6; IMIA wg4; EFMI
wg2), and dedicated (working) conferences [3], [4], [5], [6], [7].

What are the reasons for such special attention? Are there
special characteristics of medical information systems that
justify this special attention? In this section this question
is addressed as background for the security aspects of
medical databases. A general overview of security aspects of
medical information systems can be found in [8].

Although we find information systems in almost any type of
health care organisation we will focus here on the most complex
example of medical information systems, the systems that
support large hospitals. Such hospital information systems
(HIS) offer a wide range of functions to a wide range of users
(of different disciplines) [9]. Because of the interrelation
of activities in the hospital, integration is a key characteristic
of successful hospital information systems. At present
the functions of such systems, apart from some EDI facilities
to support communication with the outside world, serve
the internal information processes within the institution.

A central point of reference in such a system is the patient.
Data about his health condition and treatment are needed by
many workers in the hospital. The data have to be available at
the workplace, at the moment they are needed, in a presentation
geared towards the needs of the specific user. However,
only those data may be presented that the user is allowed
to see ("need to know principle").

The data are stored in a database. The collection of data
of one patient is often referred to as his "electronic medical
record", to indicate that the data are gradually becoming the
electronic equivalent of the paper file holding the patient
data. Until now this electronic medical record covers only
alpha-numeric data. Images are only rarely stored in information
systems, e.g. in Picture Archiving and Communication
Systems (PACS) [10]. In this paper special security
requirements for PACS will not be discussed.

Data in a HIS are intended to play a role in the care
process. This process goes on continuously, at least for
inpatients, so the functions of the system and
the data about the patient have to be available round-the-clock.

Because the data presented by the system play a direct role
in the care process their integrity is important. Wrong data
presented may harm the patient directly because they may
trigger wrong actions.

Although the principles for access rights to patient data
are the same in most institutions, the translation into rules
to be applied for checking these access rights may differ
significantly between institutions for internal access. For
access of external users the differences are often even larger.
The question whether the patient has the right of access
to his own data is answered differently in different countries.
In some countries the answer is a clear unconditional
"yes" while in other countries access is allowed only through
a physician as an intermediary.

Within an institution the right of access is not only
determined by the position/profession of the user, but also by
the answer to the question whether he is involved in the care
process and if so in what role. So a physician will have the
right to read diagnoses recorded in the system, but only for
"his patients". Whether a patient is "his patient" is determined
by looking at logistic patient data (admission data,
appointment data, waiting lists), which are recorded
within the database itself.

For access to patient data in case of medical emergencies
special provisions have to be made. In such situations it is
often impossible to check automatically whether the health
care professional has a care relation with the patient. A way
out can be found by assigning some users (doctors and nurses)
the special authority to bypass (in emergency situations) the
check on patient relation. When the system refuses access
because no such relation can be detected they can state that
the request concerns an emergency (supported by an
input message of a minimum length explaining the reason for
the request), after which the requested data are supplied. The
system sends a message about this bypass of the protection to
a supervising authority (e.g. the head of the medical
records department, or the medical director) who checks
whether the emergency really occurred.
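
The emergency provision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name AccessControl, the field names and the minimum message length of 20 characters are hypothetical and are not taken from any operational HIS.

```python
from dataclasses import dataclass, field
from typing import Optional

MIN_REASON_LENGTH = 20  # assumed minimum length of the emergency explanation


@dataclass
class AccessControl:
    care_relations: set = field(default_factory=set)      # (user, patient) pairs
    emergency_users: set = field(default_factory=set)     # users allowed to bypass
    supervisor_inbox: list = field(default_factory=list)  # bypasses awaiting review

    def request(self, user: str, patient: str,
                emergency_reason: Optional[str] = None) -> bool:
        # Normal path: a care relation derived from logistic data exists.
        if (user, patient) in self.care_relations:
            return True
        # Emergency path: only specially authorised users, and only with
        # an explanation of sufficient length.
        if (emergency_reason is not None
                and user in self.emergency_users
                and len(emergency_reason.strip()) >= MIN_REASON_LENGTH):
            # Every bypass is reported to the supervising authority.
            self.supervisor_inbox.append((user, patient, emergency_reason.strip()))
            return True
        return False
```

The access is granted immediately and checked afterwards; the supervising authority works from supervisor_inbox rather than approving each request in advance.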

Sometimes  it  is  attractive  from  a  perspective  of  efficiency 
that  several  hospitals  share  the  same  computer  centre  and  run 
the  same  hospital  information  system.  As  long  as  the  databases 
and  the  software  are  kept  strictly  separated  this  does  not 
lead  to  complications.  However  when  the  hardware  facilities 
are  shared  there  will  be  a  strong  pressure  to  share  also 
certain  categories  of  data  like:  general  identification  data, 
patient  identifier,  patient  insurance  status.  As  soon  as  such 
sharing  is  made  possible  there  will  be  a  demand  to  share  more 
data,  e.g.  results  of  laboratory  tests.  Such  sharing  of  data 
may  complicate  the  database  management. 

The  requirements  for  access  to  the  HIS  and  its  availability 
are  high.  Nowadays  for  a  mature  HIS  the  number  of  terminals  or 
workstations  exceeds  the  number  of  beds  of  the  hospital.  The 
availability  should  be  at  least  99.7%  round-the-clock. 

3. THE MEDICAL DATABASE

In the database different categories of data will be stored:

- patient data: medical, administrative and logistic
data. Examples of medical data are: results of laboratory
tests, discharge letters, diagnoses, vital signs, diets, etc.
Examples of administrative data are: insurance data, (pending)
invoices;

- data on the resources of the institution and their utilisation,
e.g. personnel data, budgets, stocks, bed occupancy,
etc.;

- logistic data: waiting lists, appointment schedules, reservations,
etc.;

- reference data, like diagnosis codes, list of drugs, general
practitioners, treatment protocols, etc.

Although the variety of data is large and the volume of
transactions will be high, the database can in general logically
be considered as relational, although often for efficiency
reasons a special structure is chosen. For most records
representing medical facts multiple occurrences are to be expected,
e.g. the result of an ECG may occur several times in the
medical record of a patient.

When several institutions decide to share the computer
facilities the implementation of several incarnations of the
system on the same computer in principle does not lead to
special requirements. However, the demand will arise to share
data flexibly; this will lead to special requirements for the
database management system.

Some quantitative data on the database of a typical European
university hospital (1000 acute beds, 300,000 outpatient
visits per year) are given here as an illustration:

- number of patients registered > 800,000

- number of diagnoses recorded > 500,000

- number of radiology reports > 500,000

- number of lab test results > 2,000,000

- number of terminals/workstations > 1,000

- number of database actions (read, write, delete, update) in
the dayshift 10-15 million per day, of which 3% additions

- total volume 6 Gbytes

- number of tables in database 6,000

- number of different attributes 20,000.

4.  SECURITY  REQUIREMENTS  FOR  MEDICAL  DATABASES 

The  requirements  for  security  are  considered  here  for  the 
three  CIA  effects:  Confidentiality,  Integrity,  Availability. 

Confidentiality  requirements 

The facilities for access control should be refined and
flexible: refined because of the wide variety of data and
the wide range of roles of users, flexible to allow each
institution to map its rules for access on the facilities of
the database management system. Access control should be based
on:

- access patterns for the different user roles to be distinguished,

- when access is requested to patient data, the existence of a
care-relation of the user with the patient concerned,

- the existence of an emergency situation,

- the entity accessed, and sometimes the specific attributes
of that entity.

A special point of concern is queries that retrieve data
from the database without going directly through patient
identification. Such search questions may nevertheless reveal sensitive
information, even if only the number of occurrences of specific
situations is reported.
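
One commonly used safeguard against such disclosure through counts is a restriction on the size of the query set. The text does not prescribe this technique; the sketch below is an illustration only, and the function name and the threshold of 5 are assumptions. Such a restriction is known not to be sufficient against all combinations of queries.

```python
MIN_QUERY_SET_SIZE = 5  # assumed threshold; the text gives no value


def guarded_count(records, predicate, threshold=MIN_QUERY_SET_SIZE):
    """Return the number of matching records, or None when the answer
    would single out a small group of patients."""
    n = sum(1 for r in records if predicate(r))
    # Refuse very small query sets and, symmetrically, very large ones:
    # a count of "all but one" reveals as much as a count of one.
    if n < threshold or n > len(records) - threshold:
        return None
    return n
```

With 20 records of which one carries a rare diagnosis, a count of that diagnosis and a count of its complement are both refused, while a count of a diagnosis shared by eight patients is answered.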

Integrity  requirements 

Because data from medical databases already play a role in the
direct care process, and such use can be expected
to increase rapidly, special attention is needed for the
integrity of the data. Incorrect data might lead to wrong
diagnostic or therapeutic decisions, causing damage to the
health of the patient. Measures have to be taken to:

- avoid loss or destruction of data;

- prevent unauthorized modifications in the programs;

- check data at entry for consistency and plausibility (taking
into consideration other data already stored, irrespective of
the access rights of the user);

- avoid inconsistent presentation of data because of limited
access rights.

The latter point is explored here slightly further. Polyinstantiation
is not unusual for some categories of data, e.g.
results of laboratory tests. For other categories it is highly
undesirable (e.g. blood group). It should be specified for
which categories such polyinstantiation is allowed. Undesirable
polyinstantiation could be avoided by applying the rule
that if someone is not allowed to read a certain record type
he is not allowed to create it either.
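
A minimal sketch of this rule follows, assuming a simple rights table per user that maps a record type to a set of rights. The names PatientRecord and SINGLE_VALUED and the category "blood_group" are hypothetical and serve only as illustration.

```python
# Hypothetical category list: types for which a second instance is undesirable.
SINGLE_VALUED = {"blood_group"}


class PatientRecord:
    """Toy medical record: maps a record type to the list of stored values."""

    def __init__(self):
        self.entries = {}

    def add(self, rights, record_type, value):
        # Rule from the text: creating a record type requires the right to read it.
        granted = rights.get(record_type, set())
        if not {"read", "create"} <= granted:
            raise PermissionError(f"no read+create right for {record_type}")
        existing = self.entries.setdefault(record_type, [])
        # Multiple instances are refused only for single-valued categories.
        if record_type in SINGLE_VALUED and existing:
            raise ValueError(f"{record_type} is already recorded")
        existing.append(value)
```

Because every user who may create a blood group entry can also read the existing one, the system can report the conflict openly instead of silently storing a second value.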

As a typical integrity problem in medical databases we
consider here the situation where data of a patient are stored
under two different patient identifications that later on turn
out to refer to the same patient. Such a situation may
occur e.g. in emergency admissions or when samples for laboratory
tests are offered with a limited amount of patient identification
data. If it cannot be made sure that the patient
is the same as one already recorded in the databank, the only
option is to consider him as a new patient. Later on it may
turn out that the patient was already in the database and the
medical (and administrative) records have to be merged. This
merging may lead to conflicts of consistency that would
normally have been handled in an interactive way at the moment
of data entry. Now special provisions are necessary to enable
e.g. the medical records officer to cope with such inconsistencies.
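
The merging step can be illustrated as follows. This is a sketch under the assumption that a record is a flat dictionary in which multi-occurrence data are held as lists; the function name merge_records is hypothetical.

```python
def merge_records(primary, duplicate):
    """Merge two records of the same patient. Returns the merged record and
    the list of conflicts left for the medical records officer."""
    merged = dict(primary)
    conflicts = []
    for key, value in duplicate.items():
        if key not in merged:
            merged[key] = value
        elif isinstance(merged[key], list) and isinstance(value, list):
            # Multi-occurrence data (e.g. lab results): keep all instances.
            merged[key] = merged[key] + value
        elif merged[key] != value:
            # Single-valued data that disagree: keep the primary value
            # and report the conflict instead of resolving it silently.
            conflicts.append((key, merged[key], value))
    return merged, conflicts
```

The conflicts list stands for the inconsistencies that would have been resolved interactively at data entry if the identity of the patient had been known.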

Such consistency problems will occur on a large scale if
two institutions were to merge and as a consequence wished to
merge their databases. The problem will also deserve attention
as soon as two institutions in a region decide to share some
categories of data.

Availability  requirements 

When medical information systems play an important role in
the care process, as is for instance the case for most hospital
information systems, there is a need for high availability.
Typical requirements are availability > 99.7% round-the-clock.
Apart from the availability percentage the time
needed to resume operations after an interruption is also important.
Three situations can be distinguished:

- simple restart possible, restoring some system parameters
and checking vital system data; this typically should not take
more than 10 minutes;

- database damaged (either by hardware malfunction, software
problem or human error); recovery necessary from safe-copies
plus logged mutations; this should not take more than 4-6
hours;

- computer centre (or most of the equipment located there)
destroyed by a disaster, e.g. because of a large fire; in this
situation external back-up facilities will be necessary, e.g.
a mobile or a remote back-up computer centre. In the latter
case data communication facilities have to be available.
Although such disasters will occur only very seldom (less than
once in 50 years) the interruption of services in the hospital
should be no longer than 24 hours.
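
To relate the availability percentage to the recovery times listed above, the permitted downtime per year can be computed. The sketch assumes that "round-the-clock" availability is measured over a full year of 365 days; the function names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: availability is measured over a full year


def downtime_budget_hours(availability):
    """Hours of unavailability per year permitted by an availability fraction."""
    return (1.0 - availability) * HOURS_PER_YEAR


def share_of_budget(outage_hours, availability):
    """Fraction of the yearly downtime budget consumed by one outage."""
    return outage_hours / downtime_budget_hours(availability)


budget = downtime_budget_hours(0.997)            # about 26.3 hours per year
restart_share = share_of_budget(10 / 60, 0.997)  # one 10-minute restart
recovery_share = share_of_budget(6, 0.997)       # one 6-hour database recovery
print(f"{budget:.2f} h/year, restart {restart_share:.1%}, recovery {recovery_share:.1%}")
```

Under this assumption 99.7% corresponds to roughly 26 hours of downtime per year, so a single database recovery of 6 hours uses more than a fifth of the yearly budget and a 24-hour disaster interruption uses nearly all of it.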

With the increasing role of the information systems in
direct patient care the requirements will become stricter.
Mirroring of the database is one of the techniques to reduce
the risk of loss of the database followed by a time-consuming
recovery action. So-called "non-stop operation" should be
considered seriously. However, it should be realized that this
provides no protection against disasters, human failures and
incorrect new versions of software subsystems. In any case, non-stop
facilities fall outside the scope of the database software as such.

5.  MEDICAL  AUDIT  ASPECTS 

Although  the  introduction  of  information  systems  in  the 
direct  patient  care  has  proceeded  slower  than  expected,  we  see 
nowadays  that  they  are  at  several  places  replacing  parts  of 
the  functions  of  the  paper  medical  record. 

Medical audit tries to answer the question whether a health
care professional acted in a specific situation in a responsible
way in view of what he knew or ought to have known. This
refers both to professional knowledge and to patient data of
the case. The interest in medical audit can be expected to
increase in view of the tendency to raise the issue
of liability more often when the result of the medical treatment is not
what the patient expected. If an information system is available
to the health care professional he can be expected to use
it in his work. Responsible behaviour will also comprise use
of the data from the system.

This implies that in an audit procedure the auditor needs
to have the possibility to see what data (on a patient) a
certain health care professional could have seen at a certain
moment. Although it is easily recognized that such a requirement
makes sense, there are no documented examples of its
fulfilment in operational systems. Let us consider the implications:

- all data in the database have to be time-stamped, to be able
to select those data that were available at a specified moment;

- after commands for deletion or modification of data, the old
values have to be preserved in such a way that they are not shown
in routine operation of the system, but are accessible for the
audit process;

- this even applies to data that have to be deleted after
expiration of the specified storage period, at least as long
as an audit should still be possible. This requires a reformulation
of the widely accepted principle in data protection
regulations that for the various types of data stored the
maximum storage period has to be specified after which they
must be destroyed;

- there has to be an audit mode, offering the auditor the
possibility to masquerade as a certain health care professional
(taking on his rights) and giving as a result to requests
to the database the answers as they would have been at a
specified moment (in the past). This implies that the
history of those data that play a role in checking access
rights should also be stored, which is not always current practice.
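
The implications listed above can be made concrete with a small sketch: every value carries a validity interval, deletion only closes the interval, the history of care relations is kept, and one query function answers "what could this user see at that moment". The class and method names are hypothetical, integer timestamps are used for brevity, and access rights are reduced to the care relation alone.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Version:
    patient: str
    item: str
    value: str
    valid_from: int                 # moment the value was recorded
    valid_to: Optional[int] = None  # moment it was replaced or deleted; None = current


class AuditableStore:
    def __init__(self):
        self.versions: List[Version] = []
        self.relations = []  # (user, patient, start, end): history of care relations

    def _close(self, patient, item, now):
        # Old values are never removed, only marked as no longer current.
        for v in self.versions:
            if v.patient == patient and v.item == item and v.valid_to is None:
                v.valid_to = now

    def write(self, patient, item, value, now):
        self._close(patient, item, now)
        self.versions.append(Version(patient, item, value, now))

    def delete(self, patient, item, now):
        self._close(patient, item, now)

    def grant(self, user, patient, start, end=None):
        self.relations.append((user, patient, start, end))

    def visible_to(self, user, patient, at):
        """Data on patient that user could see at moment `at`. Routine operation
        passes the current time; the audit mode passes a moment in the past."""
        related = any(u == user and p == patient and s <= at and (e is None or at < e)
                      for u, p, s, e in self.relations)
        if not related:
            return {}
        return {v.item: v.value for v in self.versions
                if v.patient == patient and v.valid_from <= at
                and (v.valid_to is None or at < v.valid_to)}
```

In this sketch the auditor "masquerades" simply by calling visible_to with the name of the health care professional and the moment under investigation; a real system would additionally restrict that call to the auditor.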

Fulfilling these requirements will be far from trivial; it
would require a complete overhaul of existing medical information
systems. One might consider using the daily safe-copies
of the database as a vehicle for the audit process.
This would still require an audit mode (for masquerading
as the health care professional), but would avoid the
strict time-stamping, the maintaining of historical data and
the dormant mode of deactivated data. However, medical audit
will often have to deal with critical health situations of
patients; in those situations the contents of the electronic
medical record will change rapidly. A snapshot frequency of
once per day is most probably far too low. This is underlined
by the average length of stay of patients, which is around 10
days or less in most hospitals in the western world.

It should be emphasized that the requirements of medical
audit as formulated here have not yet been raised by the
medical auditors; however, it seems highly probable that we can
expect them in a few years. It is better not to wait until the
problem arises: the various disciplines involved should consider
now what the requirements are and how these can be met by
the technology. Use of daily safe-copies may be an interim
solution while a more fundamental solution is prepared.

6.  CONCLUSION 

In this paper database security for medical information
systems has been considered, especially for hospital information
systems. Such systems are found to be very complex; the
wide variety of data and users on one side, and the types
of applications on the other, make them an interesting
case for security considerations.

Such systems already play a role in direct patient
care, and this role can be expected to increase rapidly. As a
consequence the facilities of the system will be
taken into consideration in medical audit. The question posed
is: "Did the health care professional act in a responsible
manner, in view of the information that was available to
him?" Because the information system is an important source of
information, it will be necessary to be able to replay the
patient information that at a certain point in time would have
been available to the health care professional concerned. The
consequences of fulfilling such a requirement seem to be dramatic;
solutions to cope with this challenge deserve attention.


For about a decade, a large part of research and development in the area of
database management systems (DBMS) has been devoted to the issue of extending
their functionality. The general goal behind these efforts is to efficiently provide
advanced data and information management services within standard software, thus
avoiding the need for their repeated design and incorporation within individual
application programs. The basis of all such approaches is - compared to traditional
(e.g. relational) systems - a more comprehensive and precise representation of
real world semantics within the database itself. In consequence, new generation
DBMS are about to make database technology amenable to a much broader range
of applications than traditional products do.

From the viewpoint of security, advanced DBMS concepts present both new
opportunities and new challenges. We will look at three prominent examples, namely
object-oriented, active, and federated DBMS, and briefly sketch their impact on
some security issues, primarily (discretionary) access control. Furthermore, we will
touch on the important issues of DBMS construction and security design.


Object-oriented DBMS

Object-oriented DBMS support an object-oriented data model, i.e. a data model
based on the notions of objects with values, definable behavior and identity,
encapsulation, classes, and class inheritance. Compared with the relational model,
object-orientation allows for a much more extensive and accurate modeling of real
world information. This is due to the support of complex values (using constructors
like tuple, set, list, etc., which may be applied recursively), object structures (to
express various forms of object associations), and user-defined operations on each
class (to express behavior beyond the simple retrieval and update of data).

As access control for databases obviously has to reflect the data units handled by
the respective data model, known database access control features at least have to
be adjusted to the notion of "object". It turns out that this is not as easy as it might
seem at first glance, particularly due to object structures and their variety of
semantics. On the other hand, if we can model the real world more precisely thanks
to object-orientation, we should also be able to fine-tune access control rules much
better than otherwise. In particular, we can now much more exactly differentiate
between the object operations to be allowed or denied for individual users, as
those may carry much more real world semantics than mere "read", "write" and
similar ones.


Active DBMS


Active DBMS allow - beyond providing all regular DBMS-features - the recognition
of user-defined situations in the database and beyond, and the execution of
user-defined reactions when such a situation occurs. The popular specification
paradigm for active DBMS is that of so-called event/condition/action rules (ECA-rules)

on <event> if <condition> do <action>

which can be defined in addition to the regular database schema. When an event is
detected by the system, the condition is checked on the database and if it holds, the
specified action is executed. Many details have to be considered in such systems,
including e.g. the kinds of events, conditions and actions that are supported and the
execution of rules within the given transaction model. In particular, the power of
active DBMS is highly dependent on the event model which may include the
occurrence of database operations, but also externally raised events, time events,
and various combinations thereof.
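
The ECA paradigm can be illustrated by a minimal rule interpreter. This is a sketch only: the names Rule and ActiveDB are hypothetical, events are plain strings, the database is a dictionary, and neither transaction coupling nor composite or time events are modelled.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    event: str
    condition: Callable[[dict, dict], bool]  # (database, event payload) -> holds?
    action: Callable[[dict, dict], None]     # (database, event payload) -> None


class ActiveDB:
    def __init__(self):
        self.data: dict = {}
        self.rules: List[Rule] = []

    def on(self, event, condition, action):
        """Register a rule: on <event> if <condition> do <action>."""
        self.rules.append(Rule(event, condition, action))

    def signal(self, event, payload):
        # When an event is detected, check the condition of every rule
        # registered for it and execute the action if the condition holds.
        for rule in self.rules:
            if rule.event == event and rule.condition(self.data, payload):
                rule.action(self.data, payload)
```

A rule such as "on stock_updated if quantity < 10 do add the item to a reorder list" is then registered with db.on("stock_updated", ...) and fires whenever db.signal reports a matching event.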

Obviously, active rules and their execution have to be subject to security, too. In this
respect, a rule and its constituent parts can be regarded as database elements, but
will most probably still require particular treatment with regard to access control.
Furthermore, it turns out that subtle points may arise as to which access rights have
to be applied when a reaction is executed as a consequence of an event triggered
by some other action (running on behalf of a particular user). On the other hand,
ECA-rules also allow for the flexible and dynamic specification of rather sophisticated
security policies, either by direct use or by acting as a target mechanism for
some sort of security specification language.


Federated  DBMS 

A  federated  DBMS  provides  for  the  interoperation  of  (probably  heterogeneous) 
component DBMS under one "common roof". Such federations are mainly conceived
to facilitate the integration of existing "information islands", but may also help to
allow  the  introduction  of  e.g.  an  object-oriented  DBMS  in  an  environment  where  a 
more  traditional  DBMS  is  already  in  use  (affectionately  called  "legacy  systems") 
and  where  both  kinds  of  system  have  to  interoperate.  In  this  case,  the  autonomy  of 
component  systems  is  an  important  issue. 

Once again, there are two sides of the coin as far as security is concerned. Whereas
component autonomy in particular raises various problems when it comes to
authorization and access control, we can, under a number of assumptions, even
hope to retrofit more elaborate security features to "poor" component DBMS provided
they are operated through the federal layer of the system.


DBMS  construction 

Given  the  increasing  number  and  complexity  of  DBMS  that  may  be  required,  the 
DBMS  community  gets  more  and  more  interested  in  how  to  efficiently  build  DBMS, 
without having to start from scratch all the time. Obviously, this means applying
state-of-the-art software engineering principles, an issue that by the way has been
largely neglected in the past. As a result, extensible DBMS and DBMS construction
systems exploiting configuration and generation techniques have been suggested.

In particular, it seems to be promising to regard a DBMS as a collection of resource
managers that cooperate under the control of some sort of "broker". In such a view,
it has to be determined where various security mechanisms fit in to meet all requirements.
Though work in this direction is still in its infancy, it can be expected that
some  basic  security  functionality  will  be  required  in  the  broker,  while  most  others 
can  be  nicely  organized  into  a  variety  of  "security  managers". 


Security  design 

Allowing for more comprehensive modeling of the real world by means of data models,
active rules etc. will improve the quality and maintainability of software
systems and also improve the efficiency of the software development process.
However, as the real world systems we want to automate are usually rather complex,
the use of advanced modeling facilities is unfortunately complex, too. In
consequence, it is often anything but straightforward to apply advanced
security features in the appropriate way. It is therefore not sufficient
(though very important!) to provide the technical means for powerful and effective
security  mechanisms.  In  addition,  security  administrators  have  to  be  helped  in  their 
job  by  appropriate  design  methodologies  and  tools  which  allow  them  to  formulate 
their  requirements  and  map  these  systematically  to  the  relevant  mechanisms. 


In  summary,  current  trends  in  database  technology  indeed  do  have  considerable 
impact on security concepts, in terms of both better solutions that can be supported
and  new  problems  that  need  to  be  solved.  Unfortunately,  commercial  products  in 
this  area  are  -  once  again!  -  very  slow  to  incorporate  security  features  that  are  as 
advanced  as  the  rest  of  the  system  from  the  very  beginning.  At  best,  they  are  going 
to  retrofit  them  to  the  system  in  later  releases.  The  security  community  is  thus 
challenged not only to devise and evaluate appropriate concepts, but also to push
for  and  foster  the  necessary  technology  transfer  to  DBMS  builders  and  users. 
Abstract

This  paper  examines  the  concept  of  role-based  protection  and,  in  particular,  role 
organization.  From  basic  role  relationships,  a  model  for  role  organization  is  developed. 
The  role  graph  model,  its  operator  semantics  based  on  graph  theory  and  algorithms  for 
role  administration  are  proposed.  The  role  graph  model,  in  our  view,  presents  a  very 
generalized  form  of  role  organization  for  access  rights  administration.  It  is  shown  how 
the  model  simulates  other  organizational  structures  such  as  hierarchies  [TDH92]  and 
privilege  graphs  [Bal90]. 
1 Introduction

Role-based  protection  is  a  flexible  means  of  administering  large  numbers  of  system  privileges 
especially  for  large  databases.  A  privilege  is  a  unit  of  access  to  system  information.  A  role 
is a named collection of such privileges [Bal90, KM92, NO93b]. User authorization to a role
grants  the  user  access  to  the  privileges  defined  in  the  role. 

The  advantage  of  role-based  protection  is  that  it  eases  the  administration  of  privileges 
because  of  the  flexibility  with  which  roles  may  be  configured  and  reconfigured  [TDH92, 
NO93b]. System security is served further when the role configuration process is based on
the  principle  of  least  privilege  in  which  a  role  is  equipped  only  with  sufficient  privileges  to 
facilitate  the  intended  duty  requirements  [Tho91]. 

In  an  organization  with  a  large  number  of  diverse  duty  requirements,  the  number  of 
roles  can  proliferate  as  new  roles  are  defined  to  meet  specific  duty  requirements.  Some  roles 
can  have  overlapping  functions  (hence  overlapping  privileges)  while  others  need  not  overlap. 
The  need  to  have  some  formal  manner  of  tracking  the  distribution  and  administration  of 
privileges  is  important  to  ensure  proper  exercise  of  both  responsibility  and  system  security. 
It  is  important  to  have  a  means  of  formally  expressing  role  relationships  -  one  which  reflects 
the  manner  of  distribution  of  privileges  in  a  system. 

This  paper  examines  what  we  consider  basic  relationships  that  can  exist  among  roles 
in  an  organization  and  their  application  in  modeling  role  organization.  Using  these  basic 
relationships  as  the  foundation,  a  model  for  role  organization  is  proposed.  It  is  possible  for 
the  privilege  sets  of  two  roles  to  completely  overlap  (one  is  a  subset  of  the  other),  partially 
overlap  (have  a  common  subset)  or  have  a  common  superset.  These  relationships,  along 
with  the  concepts  of  maximum  and  minimum  privilege  sets  form  the  basis  of  the  role  graph 
model.  To  demonstrate  the  expressive  power  of  this  model,  we  illustrate  how  it  simulates 
organizational  structures  such  as  hierarchies  [TDH92]  and  privilege  graphs  [Bal90]. 

'To  Appear  in  Database  Security  VIII:  Status  &  Prospects,  August,  1994. 
In the next section we discuss the concepts of privileges, roles and the advantages of
role-based protection. We formally define the term role, as used in this paper, and motivate
the  need  for  formal  role  organization.  In  section  3  we  discuss  the  basic  relationships  that 
can  exist  among  roles  and  introduce  operators  to  model  these  relationships.  We  regard  these 
relationships  as  forming  the  basis  of  role  organization  modeling.  Section  4  formally  presents 
the  role  graph  model,  and  gives  algorithms  for  role  administration.  Section  5  discusses 
model  simulation  of  other  role  organizational  structures.  Section  6  contains  the  summary 
and  conclusions. 

2  Introduction  to  Roles 

2.1  Basic  Definitions 

The idea of a role arises out of the need to provide duty functionality which is then
authorized as a single unit. A role can be seen as a job, office, set of actions of a role-holder,
a collection of responsibilities and functions or a collection of privileges pertaining to some
duty requirements [DM89, Bal90]. A role exists as an entity separate from the role holder or
role administrator. It should be equipped with sufficient functionality to enable an
authorized user to achieve the duty requirements associated with the role. Hence a clerical role
will  be  given  sufficient  access  rights  to  enable  an  authorized  user,  or  user  group,  to  perform 
clerical  duties.  Baldwin  [Bal90]  terms  these  Named  Protection  Domains  (NPDs).  Such 
a  role  specification  captures  the  responsibilities,  rights  and  obligations  associated  with  what 
Dobson  and  McDermid  [DM89]  term  a  functional  role. 

The  other  important  component  of  role  definition  is  its  structural  [DM89]  aspect  which 
captures  a  role’s  relationship  with  other  roles.  For  purposes  of  this  paper,  we  shall  use  the 
term role to refer to the functional aspect while the structural aspect of role relationships
will be captured by the structure defining their relationships, in our case a role graph model.

A  role  is  defined  in  terms  of  privileges.  A  privilege,  on  the  other  hand,  is  defined  in  terms 
of access modes and can be viewed as a unit of access rights administration.

Definition 1 Privilege: A privilege is a pair (x, m) where x refers to an object and m is a
non-empty set of access modes for x. □

The object referred to by x can be a protected data item, an object-oriented (O-O) class
definition or extent, a complex object, a resource (e.g. printer), etc. x can be any name
or identifier which uniquely specifies the associated object. m, the set of access modes, is
composed of valid modes of access to x. Its specification and administration can be subjected
to a range of security policies. In systems with simple access modes such as reads, writes,
executes, etc., m is a subset of these access modes. In complex systems, these access modes
can be composed of a series of, or nested applications of, reads, writes and executes. Where x is
an object in an O-O environment, m would be the execute mode of one or more methods. In
transactional systems, m would be a list of transactions that facilitate access to x. The exact
nature of x and m is a matter of the application environment and the associated security
policy [NO93a]. Since privileges are intended for security administration, the security policy
must  specify  how  they  are  administered.  In  our  case,  the  initialization  and  modification  of  a 
privilege  must  be  authorized. 

Definition  2  Role:  A  role  is  a  named  set  of  privileges.  It  is  a  pair  (rname,  rpset)  where 
rname  is  the  role  name  and  rpset  is  the  privilege  set.  □ 

A role's name rname uniquely identifies a role in a system. We use dot notation to
refer to a role's name and privilege set. Thus for a given role r, r.rname and r.rpset refer
to the name of the role and its privilege set, respectively. Let $\mathcal{P}$ denote the universal set
of privileges in a given system, and $\mathcal{R}$ the universal set of roles. We also define a function
$\Psi : \mathcal{R} \mapsto \mathcal{P}$, which enumerates the privileges of a given role, so that for every
$r \in \mathcal{R}$, $\Psi(r) = \{pv_1, \ldots, pv_n\} = r.rpset$.
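
Definitions 1 and 2 map naturally onto small immutable data types. The following Python sketch is only an illustration under stated assumptions: the class names Privilege and Role, the field names obj and modes, the function name psi and the sample ledger privileges are hypothetical choices, not part of the paper.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Privilege:
    obj: str               # x: name or identifier of the protected object
    modes: FrozenSet[str]  # m: non-empty set of access modes for x

    def __post_init__(self) -> None:
        # Definition 1 requires m to be non-empty.
        if not self.modes:
            raise ValueError("a privilege needs at least one access mode")


@dataclass(frozen=True)
class Role:
    rname: str                   # role name
    rpset: FrozenSet[Privilege]  # privilege set


def psi(role: Role) -> FrozenSet[Privilege]:
    """The privilege function: enumerate the privileges of a role."""
    return role.rpset


# Hypothetical sample data.
read_ledger = Privilege("ledger", frozenset({"read"}))
edit_ledger = Privilege("ledger", frozenset({"read", "write"}))
clerk = Role("clerk", frozenset({read_ledger}))
supervisor = Role("supervisor", frozenset({read_ledger, edit_ledger}))
```

Frozen dataclasses over frozensets are used so that privileges are hashable and can themselves be members of a privilege set.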


2.2  Strengths  of  Role-Based  Protection 

Role-based protection offers flexibility in system privilege administration [TDH92, NO93b].
User access rights can be varied either by explicit authorization (or revocation of
authorization) of a user to a role or by indirectly varying the role privilege set. Further advantage
is  gained  if  users  are  organized  into  groups  such  that  authorizations  are  given  to  groups,  as 
opposed  to  individuals. 

Given that system privileges can be very fine-grained, roles offer a means of managing
them incrementally. Considering the manner in which privileges can be assigned to or revoked
from a given role, this method approaches a continuum in system privilege administration
[NO93b]. A related advantage is that role-based protection can be used to enforce the
principle of least privilege where a role is defined to have only the necessary functionality
required for the associated duties [Tho91].

This  approach  offers  a  simplification  of  the  complexity  of  system  privilege  management. 
With  a  suitable  organizational  framework  capturing  role  relationships,  it  is  possible  to  analyze 
the  implications  of  given  authorizations.  Moreover,  such  a  formal  framework  lends  itself  to 
the  development  of  analytical  tools.  It  is  also  possible  that  management  tools  for  access 
rights  administration  can  be  used  in  role  management. 

Given  that  role-based  protection  is  designed  with  a  given  application  in  mind,  this  method 
provides  a  chance  for  incorporation  of  application  level  security  constraints  and  semantics 
[Tho91]. An associated advantage is that roles allow for multidirectional information flow
policies [Tho91] unlike such models as Denning's lattice model [Den76] and Bell and
LaPadula’s  [BL75]  multilevel  model.  As  well,  unlike  these  traditional  models  which  specify 
what  information  flows  should  not  take  place,  role-based  protection  affirms  which  information 
flows  can  take  place  [GMP92]. 

2.3  Roles  &  Access  Rights  Administration 

Roles  act  as  gateways  to  system  information.  The  privilege  set  of  a  given  role  determines 
what  information  is  available  via  the  role.  One  advantage  of  role-based  protection  mentioned 
in  the  previous  section  is  that  access  to  system  information  is  accomplished  at  two  levels: 
via  explicit  authorization  to  a  role  or  via  inclusion  of  some  privilege  in  a  role.  We  term 
the  former  user-role  authorization  while  the  latter  is  termed  role-privilege  authorization  (see 
figure  1).  A  third  form  of  authorization  is  role-role  authorization  [Bal90]  in  which  one  role  is 
authorized  another’s  privileges.  We  address  each  of  these  in  turn. 

In user-role authorization, a user/group is authorized access to system privileges
available via the role. Such authorization must be specified in a role's access control list. For
each  role,  such  an  access  control  list  contains  the  user  identifier  for  each  user  authorized  to 
the  role. 

Let $\mathcal{UID}$ be the set of all user identifiers, and $\mathcal{GID}$ the set of all group identifiers;

$\mathcal{ID} = \mathcal{UID} \cup \mathcal{GID}$.

Definition 3 Access Control List: A role access control list (racl) is of the form $[id_1, \ldots, id_n]$,
where $id_i \in \mathcal{ID}$. □
Figure 1: Three Kinds of Authorizations


In a secure system all roles must have access control lists, i.e. $\forall r \in \mathcal{R}, \exists\, r.racl =
[\ldots, id_i, \ldots]$. A role with an associated access control list is called a secure role.

Definition  4  Secure  Role:  A  secure  role  is  a  named  collection  of  privileges  along  with  its 
access control list. It is a triple (rname, rpset, racl), where rname is the role name, rpset is
its  privilege  set  and  racl  is  its  access  control  list.  □ 
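
A secure role and the first two kinds of authorization can be sketched as follows. This is a minimal illustration, not the paper's implementation: privileges are simplified to strings, and the names SecureRole, authorize_user, grant_privilege, may_use and the sample identifiers are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class SecureRole:
    rname: str                                     # role name
    rpset: Set[str] = field(default_factory=set)   # privilege set (simplified to strings)
    racl: List[str] = field(default_factory=list)  # user and group identifiers


def authorize_user(role: SecureRole, ident: str) -> None:
    """User-role authorization: put a user or group identifier on the racl."""
    if ident not in role.racl:
        role.racl.append(ident)


def grant_privilege(role: SecureRole, privilege: str) -> None:
    """Role-privilege authorization: add a privilege to the role's privilege set."""
    role.rpset.add(privilege)


def may_use(role: SecureRole, user: str, groups_of: Dict[str, Set[str]]) -> bool:
    """A user may use the role if the user or one of the user's groups is on the racl."""
    idents = {user} | groups_of.get(user, set())
    return any(ident in role.racl for ident in idents)


# Hypothetical sample data.
clerk = SecureRole("clerk")
grant_privilege(clerk, "read_ledger")
authorize_user(clerk, "accounts_group")
groups = {"alice": {"accounts_group"}, "bob": set()}
```

Here alice gains access through her group membership while bob does not, which mirrors the remark that authorizations are more economical when given to groups.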

Role-privilege  authorization  involves  role  configuration  in  which  a  privilege  is  added  to 
the role's privilege set. Role-role authorization [Bal90] forms the third kind of authorization.
If a role A is authorized to access a role B, it means that all of B's access rights are available
via role A. In other words, B's privileges are a proper subset of the privileges of A. Role-role
authorization  is  an  aspect  of  role  structure. 

Example 1 Suppose we have two roles: clerk and supervisor in which the supervisor
role has a role authorization to the clerk role. This means that the clerk's access rights are
available to the supervisor. A user authorized to the supervisor role can perform whatever a
user authorized to the clerk role can do. We can view the privilege relationships between the
two roles as $\Psi(\text{clerk}) \subset \Psi(\text{supervisor})$. □

This  paper  examines  role-role  authorizations  which  define  role  relationships.  These  have 
implications  on  role  organization  and  access  rights  administration.  Role-role  authorizations 
can be complex. Capturing the role relationships completely and carrying out an
analysis of the implications of privilege assignment and distribution in a system can be very
complex without some formal organizational structure. Complexity of analysis of system
privilege distribution is one shortcoming of role-based protection [TDH92, NO93b].

Baldwin's approach to access rights administration uses privilege graphs (PG) which
capture functionality, structure and authorizations. A PG (figure 2) is an acyclic graph with
three types of nodes: functionality, role and user/group. A path from a given user node
to a functionality node means that the user is authorized to execute the functionality. The
access rights available to such a user are all the privileges specified in roles on any such path.
Ting et al.'s [TDH92] approach utilizes hierarchical ordering of roles in which, for any given
roles in a path, those lower in the hierarchy have lower functionality than those higher in the
hierarchy. In general, the path captures a subsetting relationship between the roles such that
for a given directed edge $(v_i, v_j)$, $\Psi(v_i) \subset \Psi(v_j)$. Both of these structures have what we term
the acyclicity property.

Figure 2: Baldwin's Privilege Graph (node layers, top to bottom: Users/Groups, Roles, Functionality)

(Footnote to Example 1) Separation of duty [CW87], on the other hand, ensures that the supervisor does not perform both roles.

Definition 5 Acyclicity Property: A role organization structure is said to have the acyclicity
property if in a graph of the role relationships, with the roles as nodes, we have a directed
edge $(r_i, r_j)$ whenever $\Psi(r_i) \subset \Psi(r_j)$ and the graph is acyclic. □

Property 1 Role Organization Structure Acyclicity: A role organization must preserve the
acyclicity  property  in  order  to  offer  differentiated  access  to  system  information  via  role-based 
protection  techniques.  □ 
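
Whether a set of declared role-role authorizations preserves the acyclicity property can be checked mechanically. The sketch below is one possible way to do so and is not taken from the paper: it assumes edges are given as (junior, senior) name pairs and uses a standard depth-first search with three node colours; the function name has_cycle and the role names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def has_cycle(edges: Iterable[Tuple[str, str]]) -> bool:
    """Return True if the directed graph of (junior, senior) edges contains a cycle."""
    graph: Dict[str, List[str]] = {}
    for junior, senior in edges:
        graph.setdefault(junior, []).append(senior)
        graph.setdefault(senior, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    colour = {node: WHITE for node in graph}

    def visit(node: str) -> bool:
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:  # back edge: nxt is on the current path
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[node] == WHITE and visit(node) for node in graph)


hierarchy = [("clerk", "supervisor"), ("supervisor", "manager")]
broken = hierarchy + [("manager", "clerk")]
```

With the sample data, has_cycle(hierarchy) is False, while adding the edge from manager back to clerk makes has_cycle(broken) True: all three roles would then be authorized to each other's privileges and the structure would no longer offer differentiated access.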

3  Modeling  Role  Organization 

A  role  is  a  collection  of  privileges  which  facilitates  the  execution  of  some  functionality  for 
an authorized user. Roles in a system can have different kinds of relationships among them
based  on  their  associated  functionalities  and  organizational  constraints.  Thus  it  is  important 
to  develop  some  formal  organizational  framework  which  expresses  desirable  properties  for  an 
enterprise  whose  security  is  being  enforced  and,  in  the  process,  captures  the  relationships 
among  roles.  Such  a  framework  will  facilitate  the  analysis  of  privilege  distribution  and 
sharing. 

In  this  section,  we  discuss  and  model  basic  role  relationships  which  form  the  basis  of  a 
role  organization  framework.  We  start  with  relationships  between  two  roles  and  introduce 
the  concepts  of  the  minimum  and  maximum  privilege  sets  in  a  role-based  system  and  their 
relationship  with  other  roles.  Finally,  we  combine  these  concepts  to  yield  a  framework  for 
role  organization. 
Figure 3: Three Kinds of Basic Role

3.1 Basic Role Relationships

We identify three kinds of basic relationships: junior-senior, common "junior" and common
"senior". The junior-senior relationship, expressed as junior $\rightarrow$ senior, captures the fact
that the senior role's privileges include those of the junior one. A role is a common junior
of two other roles if it shares some privileges with both of these senior roles. A role which
encompasses all of the privileges of two junior roles is called a common senior to these roles.
Figure 3 shows these three possibilities with Venn diagrams over the associated privileges.
In all cases, there is privilege and functionality sharing between two roles.

1. Partial Privileges

With partial privilege sharing, privileges defined in one role are a complete subset of
privileges in another role. This implies shared functionality via the shared privileges.
For instance, the clerk and supervisor roles in example 1 share the functionality
associated with the clerk role, i.e. a user authorized to the supervisor role can
execute the functionalities associated with both roles (figure 3a).

We model such direct functionality and privilege sharing using the is-junior relationship
denoted by "$\rightarrow$". In our example, clerk $\rightarrow$ supervisor. In general, given two roles
$r_i, r_j \in \mathcal{R}$ with $r_i \rightarrow r_j$, we have the following interpretation:

$r_i$ and $r_j$ are "junior" (subservient) and "senior" (superior) roles, respectively.
Moreover, $r_i$'s privileges and functionality are available to $r_j$. Hence
$\Psi(r_i) \subseteq \Psi(r_j)$. We say $r_i$'s privileges are indirectly available to $r_j$.


Definition 6 is-junior relationship: An is-junior relationship exists between two
roles $r_i$ and $r_j$, denoted $r_i \rightarrow r_j$, if and only if $\Psi(r_i) \subseteq \Psi(r_j)$. □

The is-junior relationship can be seen as a role-role authorization in which the superior
role is authorized to the privileges of the junior role.

If we consider relative authority as a measure of the privileges associated with a role,
then the is-junior relationship can be seen as specifying which of the two roles has a
higher authority than the other. In our case the junior role exercises less authority
than the superior one. Moreover, the is-junior relationship can be seen as specifying
the flow of authority in which the senior role exercises more authority than the junior
one. Further, for this authority to be meaningful, this relationship must be acyclic, i.e. it
must preserve property 1.

2.  Common  Privileges 

Another  form  of  relationship  between  two  roles  is  where  there  is  privilege  sharing  in 
which  roles  have  a  non-empty  intersection  of  their  privilege  sets  but  with  neither  of 
the  sets  being  a  subset  nor  a  superset  of  the  other.  Such  a  relationship  can  be  used  to 
express  an  overlap  of  responsibility  (figure  3b). 

If there exists a role whose privilege set is some or all of this intersection, then
we say such a role is a common-junior of the other two roles. We denote the common-junior
relationship by "$\odot$". In general, $r_i \odot r_j$ is not unique. Suppose we have roles A,
B and C related as $C \in A \odot B$. Suppose the privilege sets associated with A and B are
$\Psi(A) = \{1,2,3,4\}$ and $\Psi(B) = \{3,4,5,6,7\}$, respectively. $\Psi(C)$ must be a common subset
of both $\Psi(A)$ and $\Psi(B)$, i.e. $\Psi(C) \subseteq (\Psi(A) \cap \Psi(B)) = \{3,4\}$.

In general, given three roles $r_i, r_j, r_k \in \mathcal{R}$ and $r_k \in r_i \odot r_j$, we have the following
interpretation:

both $r_i$ and $r_j$ are senior (superior) roles to $r_k$. Moreover, $r_k$'s privileges and
functionality are indirectly available to both $r_i$ and $r_j$. Hence $\Psi(r_k) \subseteq \Psi(r_i)$
and $\Psi(r_k) \subseteq \Psi(r_j)$.

Definition 7 common-junior relationship ($\odot$): Given roles $r_i$ and $r_j$, $r_i \odot r_j$ is all $r_k$
such that $\Psi(r_k) \subseteq (\Psi(r_i) \cap \Psi(r_j))$. □

3.  Privilege  Augmentation 

Another important consideration is privilege augmentation. In analyzing privilege
distribution it may be necessary to find a role that embodies the functionality and
privileges of two given roles. Such a role's privileges will be a superset of both given roles
(figure 3c).

The relationship in such a case is termed common-senior and is denoted by "$\oplus$". In
general, $r_i \oplus r_j$ is not unique. Suppose we have roles X, Y and Z related as $Z \in X \oplus Y$.
Let $\Psi(X) = \{1,2,3,4\}$ and $\Psi(Y) = \{6,7,8,9\}$. For Z's privileges to be a common superset
of those of X and Y, we must have $(\Psi(X) \cup \Psi(Y)) \subseteq \Psi(Z)$, i.e. $\{1,2,3,4,6,7,8,9\} \subseteq \Psi(Z)$.

Given three roles $r_i, r_j, r_k \in \mathcal{R}$ and $r_k \in r_i \oplus r_j$, we have the following interpretation:

both $r_i$ and $r_j$ are junior (subservient) roles to $r_k$. Moreover, both $r_i$'s and
$r_j$'s privileges and functionalities are indirectly available to $r_k$. Hence
$\Psi(r_i) \subseteq \Psi(r_k)$ and $\Psi(r_j) \subseteq \Psi(r_k)$.

Definition 8 common-senior relationship ($\oplus$): Given roles $r_i$ and $r_j$, $r_i \oplus r_j$ is all $r_k$
such that $(\Psi(r_i) \cup \Psi(r_j)) \subseteq \Psi(r_k)$. □

The  foregoing  relationships  can  be  extended  to  cater  for  more  than  two  roles. 
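
The three relationships of Definitions 6 to 8 reduce to subset tests on privilege sets, which the following Python sketch illustrates with the numeric examples from the text. The representation (a dictionary from role name to a frozenset of privilege numbers) and the function names is_junior, common_juniors and common_seniors are assumptions made for the illustration.

```python
from typing import Dict, FrozenSet, List

PrivilegeSets = Dict[str, FrozenSet[int]]  # role name -> privilege set of that role


def is_junior(psi: PrivilegeSets, ri: str, rj: str) -> bool:
    """Definition 6: ri is junior to rj iff Psi(ri) is a subset of Psi(rj)."""
    return psi[ri] <= psi[rj]


def common_juniors(psi: PrivilegeSets, ri: str, rj: str) -> List[str]:
    """Definition 7: all rk whose privileges lie within the intersection."""
    shared = psi[ri] & psi[rj]
    return sorted(rk for rk, pset in psi.items() if pset <= shared)


def common_seniors(psi: PrivilegeSets, ri: str, rj: str) -> List[str]:
    """Definition 8: all rk whose privileges cover the union."""
    combined = psi[ri] | psi[rj]
    return sorted(rk for rk, pset in psi.items() if combined <= pset)


# Roles A, B, C from the common-junior example.
abc: PrivilegeSets = {
    "A": frozenset({1, 2, 3, 4}),
    "B": frozenset({3, 4, 5, 6, 7}),
    "C": frozenset({3, 4}),
}

# Roles X, Y, Z from the common-senior example.
xyz: PrivilegeSets = {
    "X": frozenset({1, 2, 3, 4}),
    "Y": frozenset({6, 7, 8, 9}),
    "Z": frozenset({1, 2, 3, 4, 6, 7, 8, 9}),
}
```

Both operators return a list because, as noted in the text, the common junior and the common senior of two roles are in general not unique.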
1. Partial Privilege Sharing

From the definition of the is-junior relationship, if $(r_i \rightarrow r_j)$ and $(r_j \rightarrow r_k)$ then it must
also be true that $r_i \rightarrow r_k$ since $(\Psi(r_i) \subseteq \Psi(r_j)) \wedge (\Psi(r_j) \subseteq \Psi(r_k)) \Rightarrow (\Psi(r_i) \subseteq \Psi(r_k))$.

This then captures the transitive property of the is-junior relationship. In general, if
we have a role relationship of the form $r_i \rightarrow r_{i+1} \rightarrow \cdots \rightarrow r_{i+n}$, $n > 0$, it follows that
$\Psi(r_i) \subseteq \Psi(r_{i+1}) \subseteq \cdots \subseteq \Psi(r_{i+n})$. This captures the monotonic increasing property of
the privilege function for roles related via the is-junior relationship.

Property 2 The privilege function increases monotonically with respect to the
is-junior relationship. □

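We denote $r_i \rightarrow r_{i+1} \rightarrow \cdots \rightarrow r_{i+n} \rightarrow r_j$ by $r_i \rightarrow^{*} r_j$ for $n \geq 0$ and $r_i \rightarrow^{+} r_j$ for $n > 0$.

This leads to the concept of a path:

Definition 9 Role Path: A role path, p, between two roles $r_i$ and $r_j$ is of the form
$r_i \rightarrow^{*} r_j$. A trivial path exists between a role and itself. □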

Other properties of the is-junior relationship include reflexivity and antisymmetry.
Given roles $r_i$ and $r_j$, we have $r_i \rightarrow r_i$ (reflexivity) since $\Psi(r_i) \subseteq \Psi(r_i)$. As well,
$((r_i \rightarrow r_j) \wedge (r_j \rightarrow r_i)) \Rightarrow r_i = r_j$. This follows from the observation that
$(r_i \rightarrow r_j) \Rightarrow \Psi(r_i) \subseteq \Psi(r_j)$ and $(r_j \rightarrow r_i) \Rightarrow \Psi(r_j) \subseteq \Psi(r_i)$. With
$\Psi(r_i) \subseteq \Psi(r_j)$ and $\Psi(r_j) \subseteq \Psi(r_i)$,
and by the acyclicity property, it follows that $\Psi(r_i) = \Psi(r_j)$, which implies $r_i = r_j$.
This is the basis of the following property:

Property 3 Role Privilege Set Uniqueness: A role's privilege set must be unique. □

2. Common Privileges

From the common-junior ($\odot$) relationship above, observe that the common subset of
two roles need not be an immediate junior role of both roles in question. The following
lemma expresses the relationship between the is-junior and the common-junior
operators, $\rightarrow$ and $\odot$, respectively:

Lemma 1 If $r_k \in r_i \odot r_j$ then $r_k \rightarrow^{+} r_i$ and $r_k \rightarrow^{+} r_j$. □

The common-junior operator ($\odot$) is commutative, associative and reflexive, i.e.
$r_i \odot r_j = r_j \odot r_i$, $r_i \odot (r_j \odot r_k) = (r_i \odot r_j) \odot r_k$, and $r_i \odot r_i$ is defined and includes $r_i$.

3. Privilege Augmentation

As with the common-junior relationship, the common-senior relationship need not
involve immediate superiors of the role under consideration. The following lemma
captures the relationship between the two operators $\rightarrow$ and $\oplus$:

Lemma 2 If $r_k \in r_i \oplus r_j$ then $r_i \rightarrow^{+} r_k$ and $r_j \rightarrow^{+} r_k$. □

The operation $\oplus$ is commutative, associative and reflexive, i.e. $r_i \oplus r_j = r_j \oplus r_i$,
$r_i \oplus (r_j \oplus r_k) = (r_i \oplus r_j) \oplus r_k$, and $r_i \oplus r_i$ is defined and includes $r_i$.
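
The notions of a role path and of the monotonic privilege function (Property 2) can be checked on a small example. The sketch below assumes that is-junior edges are declared as (junior, senior) pairs and that privilege sets are sets of strings; has_path, is_monotonic and the role and privilege names are hypothetical.

```python
from collections import deque
from typing import Dict, FrozenSet, List, Set, Tuple

Edge = Tuple[str, str]  # (junior, senior)


def has_path(edges: List[Edge], start: str, goal: str) -> bool:
    """True if a role path leads from start to goal (a trivial path if they are equal)."""
    if start == goal:
        return True
    successors: Dict[str, List[str]] = {}
    for junior, senior in edges:
        successors.setdefault(junior, []).append(senior)
    seen: Set[str] = {start}
    queue = deque([start])
    while queue:  # breadth-first search over the declared edges
        node = queue.popleft()
        for nxt in successors.get(node, []):
            if nxt == goal:
                return True
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def is_monotonic(psi: Dict[str, FrozenSet[str]], edges: List[Edge]) -> bool:
    """Property 2: privilege sets never shrink along an is-junior edge."""
    return all(psi[junior] <= psi[senior] for junior, senior in edges)


psi = {
    "clerk": frozenset({"read"}),
    "supervisor": frozenset({"read", "approve"}),
    "manager": frozenset({"read", "approve", "sign"}),
}
edges = [("clerk", "supervisor"), ("supervisor", "manager")]
```

Because subset inclusion is transitive, checking each declared edge is enough: if is_monotonic holds, then every role at the end of a path holds all privileges of the role at its start.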
3.2 The Concepts of Minimum and Maximum Privilege Sets

It  is  possible  that  an  organization  provides  a  minimum  set  of  privileges  available  to  every 
user. Such a basic privilege set, for instance, can be things like the ability/permission to log
onto  a  computer  system,  the  privilege  to  get  into  certain  areas  of  an  organization’s  premises, 
etc.  In  general,  this  minimum  privilege  set  represents  the  very  minimum  that  any  valid  user 
can  be  authorized  to. 

Since  users  are  authorized  to  specific  roles,  it  is  possible  to  organize  such  a  basic  set 
of  privileges  into  a  role  such  that  they  are  available  via  explicit  authorization  or  via  role 
relationships  with  other  roles.  We  denote  the  role  with  the  basic  privilege  set  MinRole.  In 
general, depending on a particular organization, MinRole's privilege set can be empty.


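$$\Psi(\text{MinRole}) = \begin{cases} \text{minimum mandatory privilege set} & \text{if defined} \\ \emptyset & \text{otherwise} \end{cases}$$

For all $r \in \mathcal{R}$, MinRole $\rightarrow^{+}$ $r$ holds.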


Property  4  Minimum  Privilege  Property:  MinRole  is  always  defined.  □ 

With the introduction of MinRole, there is always at least one common-junior for all
roles,  namely  MinRole. 

As  with  MinRole,  we  envisage  MaxRole,  some  system  “chief  executive”  role,  which 
embodies the collection of all privileges in a given system. Theoretically, a user authorized
to MaxRole can execute any functionality using the associated privileges in whatever role
they are specified. Unlike $\Psi(\text{MinRole})$ which can be empty, $\Psi(\text{MaxRole})$ can never be empty
if the system is intended to accomplish anything at all.

$$\Psi(\text{MaxRole}) = \bigcup_{r \in \mathcal{R}} \Psi(r)$$

For all $r \in \mathcal{R}$, $r \rightarrow^{+}$ MaxRole holds.


Property  5  Maximum  Privilege  Property:  MaxRole  is  always  defined.  □ 

With  the  introduction  of  MaxRole,  there  is  always  at  least  one  common-senior  for  two 
roles,  namely  MaxRole. 
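
MinRole and MaxRole can be derived from an existing set of roles, as the following sketch suggests. It assumes privilege sets are frozensets of strings and that the mandatory minimum set, if any, is supplied by the caller; the function name add_bounds and the sample roles are hypothetical, and raising an error when a role lacks the mandatory privileges is a design choice of this example rather than something the paper prescribes.

```python
from typing import Dict, FrozenSet

PrivilegeSets = Dict[str, FrozenSet[str]]  # role name -> privilege set


def add_bounds(psi: PrivilegeSets,
               mandatory: FrozenSet[str] = frozenset()) -> PrivilegeSets:
    """Return a copy of psi extended with MinRole and MaxRole.

    MinRole holds the mandatory privileges (empty if none are defined);
    MaxRole holds the union of the privilege sets of all roles.
    """
    for name, pset in psi.items():
        if not mandatory <= pset:
            # Otherwise MinRole would not be junior to this role.
            raise ValueError(f"role {name!r} lacks the mandatory privileges")
    bounded = dict(psi)
    bounded["MinRole"] = frozenset(mandatory)
    bounded["MaxRole"] = frozenset().union(*psi.values())
    return bounded


roles: PrivilegeSets = {
    "clerk": frozenset({"login", "read_ledger"}),
    "auditor": frozenset({"login", "read_audit_log"}),
}
bounded = add_bounds(roles, mandatory=frozenset({"login"}))
```

In the result, every privilege set contains that of MinRole and is contained in that of MaxRole, so any two roles have at least one common junior and one common senior.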

The is-junior, common-junior and common-senior relationships introduced in the previous
section capture all manner of relationships that can be used to associate two or more
roles when there is need for analysis of their interaction. MinRole and MaxRole express
the concepts of minimum mandatory and maximum privilege sets, respectively, in a system.
Combining these yields representations such as those in figure 4.

For  the  purposes  of  security  and  the  need  for  dispersion  of  powers,  MaxRole  may  not 
be  authorized  to  any  one  individual  in  an  organization.  In  an  ideal  situation,  MaxRole 
conceptually  corresponds  with  the  role  of  a  Chief  Executive  in  an  organization.  It  is  unlikely 
that  an  administrative  or  a  security  policy  would  advocate  such  singular  exercise  of  powers. 
Moreover,  there  is  a  very  realistic  risk  that  allowing  exercise  of  privileges  of  MaxRole  can 
compromise  the  system.  However,  such  problems  need  not  arise  if  we  make  the  exception 
that no single user can exercise the privileges of MaxRole. This will make MaxRole a
non-executable role. Other policies may choose a collective execution of the role, e.g. by a number
of votes of authorized users. Whatever the case, authorization to MaxRole will be a matter
of a specific security policy. MaxRole, in our modeling, is useful for purposes of completeness.
It  ensures  that  every  two  roles  in  the  system  have  a  common-senior  just  as  MinRole  ensures 
that  every  two  roles  have  a  common-junior. 

Figure 4: Different Forms of Role Organization


4  A  Role  Graph  Model  for  Role  Organization 

The  basic  role  relationships  discussed  in  section  3  point  to  an  acyclic,  role  graph  organization 
for  roles.  In  this  section  we  develop  the  modeling  further  using  graph  theory.  We  present  a 
role  graph  model  for  role  organization  and  develop  algorithms  for  the  management  of  roles 
and  their  relationships. 

4.1  The  Model:  Informally 

To minimize the task of enumerating the privileges of each role, we organize them using
the concepts introduced in section 3 which incorporate acyclicity of the role graph structure
and the monotonicity of role privileges for any path. Such a structure, along with rules for
role ordering and determining the privileges associated with a role, facilitates a simple, yet
elegant, organization of roles to reflect the authority¹ attached to each role. Role ordering
and role inter-relationships, in turn, offer a means of distributing privileges among the roles.
The idea is that we explicitly assign a privilege at the lowest point in the role graph where it
is desirable. Since our formulation specifies that high order roles can execute the privileges
of the lower order ones with a connecting path, we can make the least number of explicit
privilege assignments that would facilitate the desired distribution.

From the ordering, we define authority paths that are linear (total) orders of roles according
to increasing authority, connected by the is-junior ($\rightarrow$) relationship which can be seen
to be specifying the flow of authority. In essence, the ordering asserts the fact that higher
authority roles have access to more privileges than lower ordered ones in any given path.
The effective privileges associated with a role result from those privileges directly associated
with the role and those indirectly associated with it. The former are those privileges explicitly
specified in the role while the latter are those privileges specified in lower order roles
connected by a path to the role.
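
The distinction between direct and effective privileges can be illustrated with a short Python sketch. The dictionary layout, the name `effective_privileges` and the sample roles are assumptions made for this illustration only; the model itself does not prescribe a data structure.

```python
# Hypothetical sample data: direct privileges per role and is-junior
# edges written as (junior, senior) pairs. The graph must be acyclic.
DIRECT = {"A": {1}, "B": {2}, "E": {5}, "H": {9, 10}, "C": {3}, "F": {6}}
IS_JUNIOR = {("A", "E"), ("B", "E"), ("E", "H"), ("C", "F")}


def effective_privileges(role, direct, edges):
    """Return the direct privileges of `role` together with the
    privileges of every role connected to it from below by a path."""
    privileges = set(direct[role])
    for junior, senior in edges:
        if senior == role:
            # Recursion terminates because the role graph is acyclic.
            privileges |= effective_privileges(junior, direct, edges)
    return privileges


if __name__ == "__main__":
    for name in sorted(DIRECT):
        print(name, sorted(effective_privileges(name, DIRECT, IS_JUNIOR)))
```

Only the lowest role that needs a privilege lists it explicitly; every role above it on a path picks it up through the recursion.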

4.2  The  Role  Graph  Model:  Formally 

This section presents the formal organization of roles into a role graph $RG = (\mathcal{R}, \rightarrow)$, as shown
in figure 5. The nodes of the graph correspond to the roles given, and include MaxRole and
MinRole: $\mathcal{R} = \{r_1, r_2, \ldots, r_n, \text{MaxRole}, \text{MinRole}\}$. The edges are defined by the is-junior
relationship. Note that by the definition of privileges for MaxRole and MinRole and the
definition of is-junior, there is an edge from MinRole to every $r_i$, and an edge from every
$r_i$ to MaxRole. The common-junior and common-senior relationships ($\odot$ and $\oplus$) still have
the same meaning as previously.

¹Our use of this term will become clear as we advance.

Note  that  if  a  system  administrator  is  specifying  roles,  it  is  possible  that  the  privileges  are 
specified  in  a  highly  redundant  fashion.  In  other  words,  rather  than  specifying  the  minimum 
set  of  direct  privileges  for  a  role,  some  indirect  privileges  might  be  given  as  being  direct 
privileges. The function $\Psi(r)$ returns the set of all direct and indirect privileges of a role,
which we also call the effective privileges. The version of the graph which we will present to
the  role  administrator  should  neither  have  redundant  privilege  specifications  nor  redundant 
is-junior  relationships  (i.e.  redundant  graph  edges),  in  order  to  highlight  the  true  nature  of 
the  role  relationships.  We  will  further  explain  this  reduced  form  of  the  graph  shortly. 

Paths  in  the  role  graph  not  involving  MaxRole  and  MinRole  are  of  more  interest  to 
us.  Consequently,  we  shall  use  the  following  role  graph  path  definition  in  the  subsequent 
sections. 

Definition 10 Role Graph Path: A role graph path, $p$, is of the form $r_i \rightarrow r_{i+1} \rightarrow \cdots \rightarrow r_{i+n}$,
$i > 0$, $n > 0$, such that $r_i \neq \text{MinRole} \wedge r_{i+n} \neq \text{MaxRole}$. □

The quadruple $(\mathcal{R}, \rightarrow, \odot, \oplus)$, which includes MaxRole and MinRole, specifies an authority
structure for roles. For any role graph path of the form $r_1 \rightarrow \cdots \rightarrow r_n$, $n > 1$, we have
an authority relation of the form $r_1 < \cdots < r_n$ with the authority embodied in the roles on
a path totally ordered. In general, given any two roles $r_1, r_2 \in \mathcal{R}$, either $r_1 < r_2$, $r_2 < r_1$, or they are
incomparable. Where there is a path (call it an authority flow path), the roles in the path
form a total order.

Definition  11  Path  Role  Set:  The  role  set  of  a  given  path,  denoted  by  r(p),  is  the  set  of  all 
roles  that  compose  the  path.  We  say  that  a  given  role  participates  in  a  path  if  it  belongs  to 
the  path’s  role  set.  □ 

We extend the function $\Psi$ to paths as follows: for a path $p$, $\Psi(r(p)) = \bigcup_{r_k \in r(p)} \Psi(r_k)$.

Definition 12 Path Independence: Let $p_i$ and $p_j$ be two paths in a role graph. We say $p_i$ is
independent of $p_j$ if $\Psi(r(p_i)) \cap \Psi(r(p_j)) = \Psi(\text{MinRole})$. □

In  other  words,  the  two  role  sets  are  related  only  via  MaxRole  and  MinRole.  Such 
independence  can  be  exploited  to  prohibit  privilege  sharing  by  ensuring  that  the  privilege 
sets  of  two  independent  paths  are  disjoint. 
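
A minimal Python sketch of the independence test follows. It assumes that the privilege set of a path is the union of the effective privileges of the roles on it, and that effective privileges have already been computed; the names `path_privileges`, `paths_independent` and the sample table are hypothetical.

```python
# Hypothetical, precomputed effective privilege sets for a few roles.
EFFECTIVE = {
    "A": {1}, "B": {2}, "E": {1, 2, 5}, "H": {1, 2, 5, 9, 10},
    "C": {3}, "F": {3, 6},
}


def path_privileges(path, effective):
    """Union of the effective privileges of all roles on the path
    (an assumed reading of how the privilege function extends to paths)."""
    result = set()
    for role in path:
        result |= effective[role]
    return result


def paths_independent(path_a, path_b, effective, min_role_privileges=()):
    """Two paths are independent when the only privileges they share
    are those of MinRole (empty by default in this sketch)."""
    shared = path_privileges(path_a, effective) & path_privileges(path_b, effective)
    return shared == set(min_role_privileges)


if __name__ == "__main__":
    print(paths_independent(["A", "E", "H"], ["C", "F"], EFFECTIVE))
    print(paths_independent(["A", "E"], ["B", "E"], EFFECTIVE))
```

With an empty MinRole, independence reduces to the two path privilege sets being disjoint, which is the property a designer would rely on to prohibit privilege sharing.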

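Example 2 Consider figure 6 where we have distinct privileges numbered $1, \ldots, 12$ with privileges
$1, \ldots, 6$ directly assigned to roles $A, \ldots, F$ and $\{7, 8\}$, $\{9, 10\}$, $\{11, 12\}$ assigned to roles
$G$, $H$, $I$ respectively. We have a role graph specification as follows: $A \rightarrow E$, $B \rightarrow E$, $C \rightarrow F$,
$D \rightarrow G$, $E \rightarrow \{H, I\}$, $\{F, G\} \rightarrow I$, $\text{MaxRole} = H \oplus I$, and $\text{MinRole} = A \odot B \odot C \odot D$ with
$\Psi(\text{MinRole}) = \emptyset$.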

From this we can compute the privileges of various roles and obtain the privilege distribution
as in table 1. Moreover, we have the following relationships relating to the $\odot$, $\oplus$, $\rightarrow$ operators:

1. The common-junior operator, $\odot$, defines a common subset of privileges for any two roles.
Consider $E \in H \odot I$ and note that $\Psi(H \odot I) = \Psi(E) = \{1, 2, 5\} = \Psi(H) \cap \Psi(I)$.

2. The common-senior operator, $\oplus$, defines the union of privileges of two roles and as such
is a common superset for any two roles. Consider $I \in F \oplus G$ and note that $\Psi(F \oplus G) =
\Psi(F) \cup \Psi(G) = \{3, 4, 6, 7, 8\} \subset \Psi(I) = \{3, 4, 6, 7, 8, 11, 12\}$.

3. The is-junior operator, $\rightarrow$, defines a proper subset relationship between two roles, e.g.
$E \rightarrow H$. Note that $\Psi(E) = \{1, 2, 5\} \subset \{1, 2, 5, 9, 10\}$. This is true for all roles related via
the is-junior relationship.

4. Paths $A \rightarrow E \rightarrow H$ and $C \rightarrow F$ are independent paths since their role sets $\{A, E, H\}$ and
$\{C, F\}$ are mutually exclusive and the two paths are related only via MaxRole and
MinRole.


□ 

The  role  graph  in  figure  6  shows  only  direct  (non-redundant)  privileges  for  each  node,  and 
has no redundant edges. Specifying a role's direct privileges and its is-junior relationships
with other roles completely specifies its effective privileges.

Definition 13 Direct Privileges: Let Direct(r) denote the direct privileges of a role; i.e.
$\mathit{Direct}(r) \subseteq \Psi(r)$ such that for all $r_i \rightarrow r$, $\Psi(r_i) \cap \mathit{Direct}(r) = \emptyset$. □

For  the  purpose  of  the  algorithms  below,  assume  that  for  each  role  in  a  role  graph,  we  keep 
Direct(r) and is-junior relationships. By the definition of is-junior, the edge set in the role
graph will in fact be highly redundant. What we want to present to the role administrator,
and maintain, is the transitive reduction of the graph [AGU72]. The transitive reduction of
an acyclic graph is a graph in which there is no edge $r_i \rightarrow r_j$ whenever there is another path
from $r_i$ to $r_j$ in the graph. Inputs to and outputs of the algorithms assume well-formed graphs.
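
To make the notion of transitive reduction concrete, the following Python sketch removes every edge for which an alternative path exists. It is a straightforward quadratic illustration under the assumption of an acyclic edge set, not the algorithm of [AGU72]; the function names are hypothetical.

```python
def reachable(start, edges, skip_edge=None):
    """Roles reachable from `start` along (junior, senior) edges,
    optionally ignoring one particular edge."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for u, v in edges:
            if u == node and (u, v) != skip_edge and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def transitive_reduction(edges):
    """Keep an edge (u, v) only if v cannot be reached from u without it.
    For an acyclic graph this yields the unique transitive reduction."""
    edges = set(edges)
    return {
        (u, v)
        for (u, v) in edges
        if v not in reachable(u, edges, skip_edge=(u, v))
    }


if __name__ == "__main__":
    redundant = {("A", "E"), ("E", "H"), ("A", "H")}
    print(sorted(transitive_reduction(redundant)))
```

The edge from A to H is dropped because the path through E already conveys the same authority flow.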

Definition  14  Role  Graph  Well-Formedness:  A  role  graph  is  well-formed  if  it  is  a  transitive 
reduction  and  if  the  direct  privilege  set  associated  with  each  role  r  conforms  to  the  definition 
of Direct(r). □

By the original definition of the edge set (based in turn on the is-junior relationship which
depends on the effective privilege sets of nodes), a path $r_i \rightarrow^{+} r_j$ exists in the well-formed
role graph whenever $\Psi(r_i) \subset \Psi(r_j)$. The following terms will be useful in the algorithms to
be  presented  below: 

Definition 15 Juniors(r): The set of junior roles for a given role $r$ is all $r_i$ such that $r_i \rightarrow^{+} r$. □

Definition 16 Seniors(r): The set of senior roles for a given role $r$ is all $r_i$ such that $r \rightarrow^{+} r_i$. □

Constraint  1  Role  Graph  Privilege  Set  Invariant  Constraint:  The  effective  privilege  set  of 
every role in a role graph remains invariant unless altered by the system security officer, SSO.

□ 

The  SSO  exercises  privileges  like  any  other  system  user  by  executing  in  an  authorized 
security  administration  role.  This  can  be  seen  as  the  security  information  administration 
role. However, care must be taken to ensure there is no conflict of interest. Hence no one
user, whether SSO or not, should be able to administer security information pertaining to
one's own access rights.

4.3  Role  Graph  Maintenance  Algorithms 

We  are  now  ready  to  introduce  some  algorithms  to  assist  a  role  administrator  in  specifying  and 
modifying  a  collection  of  roles.  These  will  ultimately  be  incorporated  in  a  role  maintenance 
tool.

Our  goal  is  to  have  all  the  operations  map  a  well-formed  role  graph  to  another  well-formed 
role  graph.  We  assume  that  the  administrator  begins  with  a  graph  containing  only  MaxRole 
and  MinRole.  Any  direct  privileges  defined  for  MinRole  can  be  specified  at  this  time. 

The role graph can be expanded at any point by adding new roles as the need may arise
while retaining the role graph structure. This strategy offers a flexible manner of introducing
new privileges into the role graph. Such privileges can be incorporated into an existing role
graph by introduction of new roles or by increasing the privileges of existing ones. New roles
can  be  introduced  by  the  addition  of  completely  new  roles,  or  by  partitioning  existing  roles 
either  horizontally  or  vertically.  We  also  consider  role  deletion.  In  all  these  cases,  we  can 
have  an  increment  or  decrement  in  the  overall  privileges  associated  with  paths  in  which  the 
affected  role  participates.  Such  privileges  can  remain  invariant,  be  reduced  or  be  increased 
depending  on  the  operation.  Given  the  space  constraints  here,  we  address  the  cases  where 
(1)  path  privileges  are  introduced  with  the  addition  of  a  new  role,  (2)  path  privileges  may 
or  may  not  remain  invariant  with  the  deletion  of  a  role,  (3)  path  privileges  are  partitioned 
with the horizontal partition of a role and (4) privileges remain invariant with the vertical
partition  of  a  role. 

Consequently, after carrying out the operations on the graph, our procedures will concern
themselves with the immediate neighbourhood of the target role. In other words we look
for  redundant  arcs  generated  due  to  the  operation  in  question.  This  involves  the  immediate 
senior  and  immediate  junior  role  sets  of  the  roles  affected  by  the  operations. 

4.3.1 Role Addition & Deletion

By  role  addition  we  mean  the  creation  and  incorporation  of  a  totally  new  role  into  the  role 
graph. Such a role is defined (name and privilege set) before being integrated into the role
graph. While the integration process must preserve the role definition, it is important to
ensure that if there are privileges defined in the new role that exist in junior roles in the
target paths, they must be removed to take away the redundancy. To introduce such a role
requires the specification of the target paths and the position in the paths. This involves the
specification of the target superior and junior role(s) for the role to be added (see figure 7).

[Figure 7 panels: Target Junior and Senior; Insert Role; Eliminate Redundant Arcs]


The  role  to  be  inserted  is  added  to  the  node  set  of  the  graph,  and  the  appropriate  edges 
are  created  to  indicate  the  immediate  junior  and  superior  roles.  It  is  possible  that  a  node 
already  exists  with  the  same  effective  privilege  set.  Once  this  possibility  has  been  eliminated, 
redundant  paths  are  removed  from  the  resulting  structure.  Finally,  privilege  resolution  is 
done  to  remove  redundant  privileges  from  Direct  of  the  new  node,  and  privileges  in  nodes  in 
Seniors of the new node made redundant by this insertion. Note that for a node $r$, the
set Seniors(r) can be enumerated by a depth-first search in the role graph starting at node
r  [CLR90,  Man89].  Similarly,  the  set  Juniors(r)  can  be  computed  by  a  depth-first  search  of 
the  graph  formed  by  reversing  the  edges  in  the  role  graph,  again  starting  the  search  at  node 
r.  The  details  of  these  operations  will  not  be  given  here. 
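
Although the details are omitted here, the two searches can be sketched in a few lines of Python. The edge representation as (junior, senior) pairs and the function names `seniors` and `juniors` are assumptions of this sketch.

```python
def seniors(role, edges):
    """Depth-first enumeration of every role reachable from `role`
    along is-junior edges given as (junior, senior) pairs."""
    found, stack = set(), [role]
    while stack:
        current = stack.pop()
        for junior, senior in edges:
            if junior == current and senior not in found:
                found.add(senior)
                stack.append(senior)
    return found


def juniors(role, edges):
    """The same search applied to the graph with all edges reversed."""
    reversed_edges = {(senior, junior) for junior, senior in edges}
    return seniors(role, reversed_edges)


if __name__ == "__main__":
    graph = {("A", "E"), ("B", "E"), ("E", "H"), ("C", "F")}
    print(sorted(seniors("A", graph)))
    print(sorted(juniors("H", graph)))
```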

Algorithm 1 in figure 8 and figure 7 illustrate the role addition process.
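
The main steps of role addition can also be sketched in Python. This is a simplified illustration rather than a transcription of algorithm 1: it omits the duplicate-role check and the inferred edges, and it replaces the neighbourhood-only arc removal with a full transitive reduction. All function and parameter names are hypothetical.

```python
def reachable(start, edges, skip_edge=None):
    """Roles reachable from `start` along (junior, senior) edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for u, v in edges:
            if u == node and (u, v) != skip_edge and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def effective(role, direct, edges):
    """Direct privileges of `role` plus those of all its juniors."""
    privileges = set(direct[role])
    for other in direct:
        if role in reachable(other, edges):
            privileges |= direct[other]
    return privileges


def add_role(direct, edges, target, target_direct, juniors, seniors):
    """Insert `target` above `juniors` and below `seniors`."""
    juniors, seniors = set(juniors), set(seniors)
    # Acyclicity: no chosen senior may coincide with or reach a chosen junior.
    if juniors & seniors or any(reachable(s, edges) & juniors for s in seniors):
        raise ValueError("insertion would violate acyclicity")
    inherited = set()
    for j in juniors:
        inherited |= effective(j, direct, edges)
    new_direct = {r: set(p) for r, p in direct.items()}
    # Privileges already available through juniors are redundant in the new role.
    new_direct[target] = set(target_direct) - inherited
    new_edges = set(edges)
    new_edges |= {(j, target) for j in juniors}
    new_edges |= {(target, s) for s in seniors}
    # Privileges now inherited from the new role are redundant in its seniors.
    for s in reachable(target, new_edges):
        new_direct[s] -= set(target_direct)
    # Simplification: full transitive reduction instead of local arc removal.
    new_edges = {
        (u, v) for (u, v) in new_edges
        if v not in reachable(u, new_edges, skip_edge=(u, v))
    }
    return new_direct, new_edges


if __name__ == "__main__":
    d, e = add_role({"J": {1}, "S": {3}}, {("J", "S")}, "T", {1, 2}, ["J"], ["S"])
    print(d, sorted(e))
```

In the usage line, privilege 1 is dropped from the new role because its junior already holds it, and the old direct edge from J to S disappears because the path through T makes it redundant.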

The  flip  side  of  role  addition  is  role  deletion  which  involves  the  elimination  of  a  role  from 
the  role  graph.  This  process  requires  specifying  the  target  role  and  short-circuiting  it  by 
making  the  target’s  immediate  subservient  role(s)  the  immediate  subservient  role(s)  of  the 
target’s  immediate  superior(s).  In  doing  so,  the  privileges  associated  with  the  deleted  role  can 
either  be  eliminated  or  distributed.  Privilege  elimination  involves  overall  privilege  reduction 
of  the  path  associated  with  the  role  so  deleted. 

Retaining  the  privileges  of  the  deleted  role,  on  the  other  hand,  requires  a  specification  of 
how  these  privileges  will  be  distributed  among  the  existing  roles.  It  is  reasonable  to  assume 
that  such  role  deletion  would  not  affect  the  effective  privilege  sets  of  any  superior  roles  of 
the  deleted  role.  Hence  such  privileges  must  be  transferred  to  the  immediate  superiors.  This 
would  ensure  path  privilege  invariance.  This  case  is  illustrated  pictorially  in  figure  9.  See 
the  associated  algorithm  2  of  figure  10. 

Example  3  Suppose  our  target  role  for  deletion  is  role  D  in  figure  9a  with  the  constraint  that 
all  existing  paths  must  keep  their  privilege  sets  invariant.  For  this  purpose  we  choose  to  shift  the 
privilege  set  of  the  target  role  to  its  superiors. 

To achieve this, first transfer the privileges from role D to both F and G, which are both
superior to D. This results in roles FX and GX, which we make immediate superiors of
both A and B. The previous edges incident to role D, i.e. $A \rightarrow D \rightarrow F$, $A \rightarrow D \rightarrow G$, $B \rightarrow$




Algorithm 1 Role_Addition(rg, target, s_target_set, j_target_set)

/* For the addition of a given role into a role graph */

Input: rg = $(\mathcal{R}, \rightarrow)$ (the role graph), target role to be added (role name along with its proposed direct privilege set),
    s_target_set (immediate superior set for the target), j_target_set (immediate junior set for the target).
Output: The role graph with target added and overall privileges of other roles left intact.

Var $r$, $r_i$, $r_j$, $r_s$: roles;
Begin
  If $\exists (r_s \rightarrow^{+} r_j)$ for any $r_s \in$ s_target_set, $r_j \in$ j_target_set
  Then abort    /* Must not violate acyclicity */
  Else Begin
    1. $\Psi$(target) := ($\bigcup \Psi(r)$ over $r \in$ j_target_set) $\cup$ Direct(target);    /* Compute the effective privileges of target role */
    2. If $\Psi(r) = \Psi$(target) for any $r \in \mathcal{R}$
       Then target := $r$;    /* Role privilege sets must be unique */
    3. If $\exists$(target $\rightarrow^{+} r_j$) for any $r_j \in$ j_target_set
       Then abort    /* Must not violate acyclicity */
       Else Begin
         a. $\mathcal{R} := \mathcal{R} \cup$ target;    /* Add target to system roles */
         b. For all $r_s \in$ s_target_set do add the edge target $\rightarrow r_s$;
         c. For all $r_j \in$ j_target_set do add the edge $r_j \rightarrow$ target;
         d. If for any $r \in \mathcal{R}$, $\Psi(r) \subset \Psi$(target) and NOT($r \rightarrow^{+}$ target)
            Then add the edge $r \rightarrow$ target;    /* Add this inferred edge */
         e. If for any $r \in \mathcal{R}$, $\Psi$(target) $\subset \Psi(r)$ and NOT(target $\rightarrow^{+} r$)
            Then add the edge target $\rightarrow r$;    /* Add this inferred edge */
         f. Rem_Red_Arcs(rg, j_target_set, s_target_set, target);
         g. Red_Priv_Res(rg, j_target_set, target);
         h. For all $r_i, r_j \in \mathcal{R}$ if $\Psi(r_i) = \Psi(r_j)$ then    /* Remove any duplicate roles */
            Begin for all $r_i \rightarrow r$ do add the edge $r_j \rightarrow r$;
                  for all $r \rightarrow r_i$ do add the edge $r \rightarrow r_j$;
                  Delete all edges $r_i \rightarrow r$ and $r \rightarrow r_i$;
                  Remove $r_i$; end;
       end
  end
end    /* Role_Addition */

Procedure Rem_Red_Arcs(var rg: role graph; j_target_set, s_target_set: role_set; target: role);
/* Removes redundant arcs in the immediate neighbourhood of target role */
Var $r_j$, $r_s$, $r$: roles;
Begin
  1. For all $r_j \in$ j_target_set do    /* Remove direct paths where there is another path */
       if $\exists (r_j \rightarrow r \rightarrow \cdots \rightarrow$ target) then
         Delete the edge $r_j \rightarrow$ target;    /* delete the direct edge */
  2. For all $r_s \in$ s_target_set do
       if $\exists$(target $\rightarrow r \rightarrow \cdots \rightarrow r_s$) then
         Delete the edge target $\rightarrow r_s$;    /* delete the direct edge */
end    /* Rem_Red_Arcs */

Procedure Red_Priv_Res(var rg: role graph; j_target_set: role_set; target: role);
Var pv: privilege; $r$: role;
Begin
  1. For all $r$ in Seniors(target) do    /* remove redundant privileges from senior roles */
       For all pv $\in$ Direct(target) do
         if pv $\in$ Direct($r$) then
           Direct($r$) := Direct($r$) $-$ pv;
  2. For all $r$ in j_target_set do    /* remove redundant privileges from Direct(target) */
       For all pv $\in \Psi(r)$ do
         if pv $\in$ Direct(target) then
           Direct(target) := Direct(target) $-$ pv;
end    /* Red_Priv_Res */


Figure 8: Algorithm for Role Addition

Figure 9: Role Deletion


$D \rightarrow F$, and $B \rightarrow D \rightarrow G$ are replaced by $A \rightarrow FX$, $A \rightarrow GX$, $B \rightarrow FX$, and $B \rightarrow GX$,
respectively.

The next move is to identify the redundant alternative paths (marked X in figure 9b)
and remove them. We notice that paths $A \rightarrow C \rightarrow FX$ and $B \rightarrow E \rightarrow GX$ contain the set
of privileges of paths $A \rightarrow FX$ and $B \rightarrow GX$, respectively. This results in a new role graph
structure as shown in figure 9c. □

Both  role  addition  and  deletion  correspond  to  real  life  situations  where  in  creating  a  new 
portfolio,  a  new  role  is  added  while  in  eliminating  some  “office”,  a  role  will  be  deleted.  Role 
deletion  without  privilege  reduction  entails  elimination  of  some  “office”  in  an  organization 
while  retaining  the  total  functionality.  Privileges  of  the  deleted  role  would  be  distributed  to 
other  roles. 
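
The deletion procedure can be sketched in Python as follows. The sample graph is only loosely modelled on Example 3: the privilege numbers are invented for illustration, the roles F and G keep their names instead of being renamed FX and GX, and the function names are hypothetical.

```python
def reachable(start, edges, skip_edge=None):
    """Roles reachable from `start` along (junior, senior) edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for u, v in edges:
            if u == node and (u, v) != skip_edge and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def delete_role(direct, edges, target, keep_privileges):
    """Remove `target`, linking its immediate juniors to its immediate
    seniors; optionally hand its direct privileges to those seniors."""
    s_set = {s for j, s in edges if j == target}
    j_set = {j for j, s in edges if s == target}
    new_direct = {r: set(p) for r, p in direct.items() if r != target}
    if keep_privileges:
        for s in s_set:
            new_direct[s] |= direct[target]
    # Drop the edges incident to target, then add the bypass edges.
    new_edges = {e for e in edges if target not in e}
    bypass = {(j, s) for j in j_set for s in s_set}
    new_edges |= bypass
    # A bypass edge is redundant if another path joins the same two roles.
    for edge in bypass:
        if edge[1] in reachable(edge[0], new_edges, skip_edge=edge):
            new_edges.discard(edge)
    return new_direct, new_edges


if __name__ == "__main__":
    direct = {"A": {1}, "B": {2}, "C": {3}, "D": {4}, "E": {5}, "F": {6}, "G": {7}}
    edges = {("A", "D"), ("B", "D"), ("D", "F"), ("D", "G"),
             ("A", "C"), ("C", "F"), ("B", "E"), ("E", "G")}
    new_direct, new_edges = delete_role(direct, edges, "D", keep_privileges=True)
    print(new_direct)
    print(sorted(new_edges))
```

With `keep_privileges=True` the effective privilege sets of F and G are unchanged by the deletion, which corresponds to the path privilege invariance discussed above; with `False` the privileges of the deleted role simply vanish from those paths.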

4.3.2  Role  Partition 

A  role  can  be  partitioned  into  two  or  more  roles  in  our  role  graph.  Essentially,  the  basic 
partition  operations  are  either  vertical  or  horizontal,  and  can  of  course  be  combined.  In  both 
cases  it  must  be  specified  what  the  new  roles  and  their  corresponding  privileges  are.  Where 
the  order  of  “seniority”  is  required,  as  in  the  case  of  vertical  partition,  it  must  be  specified 
as  well. 

In vertical role partition, a role is split into two or more roles and an ordering is
imposed on them with the $\rightarrow$ relationship. In doing vertical partition, we must specify
the target role, the new roles to be created, their direct privileges and their ordering (according
to partial privilege criterion). For instance, a role $X$ is not only partitioned into
roles $X_1, \ldots, X_n$, but also, these roles must be ordered, e.g. $X_1 \rightarrow \cdots \rightarrow X_n$ (see figure 11b
and algorithm 3 of figure 12). Privilege distribution among the new roles is constrained by
the privileges associated with the role being partitioned; there must not be an increment or
decrement of privileges, i.e.

\[
\mathit{Direct}(X) = \bigcup_{i=1}^{n} \mathit{Direct}(X_i)
\]

Consequently,  the  privileges  associated  with  the  paths  in  which  the  role  appears  neither 

Algorithm 2 Role_Deletion(rg, target, inv)

/* Deletes a specified role, retaining or discarding its privileges depending on inv */

Input: rg = $(\mathcal{R}, \rightarrow)$ (the role graph structure), target (the target role to be deleted),
    inv (Boolean indicating whether or not to retain the role's privileges).
Output: The role graph structure with target deleted.

Var s_set, j_set: role_set; $r$, $r_s$, $r_j$: role;
Begin
  1. s_set := Superior_Set(target);    /* Get the senior set */
  2. j_set := Junior_Set(target);    /* Get the junior set */
  3. For all $r_s \in$ s_set do    /* Connect Junior and Senior Roles */
       For all $r_j \in$ j_set do add $r_j \rightarrow r_s$;
  4. If inv then do    /* Transfer Privileges to superiors */
       For all $r_s \in$ s_set do
         Direct($r_s$) := Direct($r_s$) $\cup$ Direct(target);
  5. For all $r_s \in$ s_set do    /* Remove all redundant arcs (alternative paths that do not pass through target) */
       For all $r_j \in$ j_set do
         If $\exists (r_j \rightarrow r \rightarrow \cdots \rightarrow r_s)$ then delete $r_j \rightarrow r_s$;
  6. $\mathcal{R} := \mathcal{R} -$ target;    /* Take out target from system roles */
end    /* Role_Deletion */

Function Superior_Set(var rg: role graph; target: role): role_set;
Var Tempset: role_set; $r$: role;
Begin
  1. Tempset := $\emptyset$;
  2. For all $r$ with target $\rightarrow r$ do
       Tempset := Tempset $\cup$ $r$;
  3. Superior_Set := Tempset
end    /* Superior_Set */

Function Junior_Set(var rg: role graph; target: role): role_set;
Var Tempset: role_set; $r$: role;
Begin
  1. Tempset := $\emptyset$;
  2. For all $r$ with $r \rightarrow$ target do
       Tempset := Tempset $\cup$ $r$;
  3. Junior_Set := Tempset
end    /* Junior_Set */


Figure 10: Algorithm for Role Deletion


Figure  11:  Vertical  &  Horizontal  Role  Partition 

decrease nor increase. In general, vertical partition leaves the privilege set associated with all
paths  unaffected;  only  the  path  length  increases. 

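Further constraints include the requirement for distinct direct privilege sets for the newly
created roles, i.e. for any

\[
X_i, X_j \in \{X_1, \ldots, X_n\},\ i \neq j:\quad \mathit{Direct}(X_i) \cap \mathit{Direct}(X_j) = \emptyset
\]

Suppose we have a target role for partition (call it $X$) with a relationship $\{J_1, \ldots, J_n\} \rightarrow X \rightarrow \{S_1, \ldots, S_n\}$
which is partitioned vertically into roles $\{X_1, \ldots, X_n\}$ such that $\{J_1, \ldots, J_n\} \rightarrow \{X_1 \rightarrow \cdots \rightarrow X_n\} \rightarrow \{S_1, \ldots, S_n\}$.
It follows that $(X_n \in (S_1 \odot S_2 \odot \cdots \odot S_n)) \wedge (X_1 \in (J_1 \oplus J_2 \oplus \cdots \oplus J_n))$.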

Horizontal  role  partition,  on  the  other  hand,  involves  partitioning  a  role  into  two  or 
more  roles  with  none  of  them  being  subservient  (superior)  to  another  (see  figure  11c  and 
algorithm  4  of  figure  13).  Partition,  as  used  here,  merely  distributes  the  direct  privileges 
of  the  target  role  among  newly  created  roles  that  replace  it.  In  partitioning  a  role,  there 
should  be  no  effective  increment  or  decrement  of  privileges.  In  other  words,  as  with  vertical 
partitioning, if role $X$ is partitioned into roles $X_1, \ldots, X_n$, we require that

\[
\mathit{Direct}(X) = \bigcup_{i=1}^{n} \mathit{Direct}(X_i)
\]

The direct privilege sets of these newly created roles can have empty or non-empty intersections.
However, no two of them should have identical privilege sets. Note that, unlike vertical
partition, horizontal partition can cause a variation of privileges associated with a path when
the target role is the senior-most role in the path.

Suppose we have a target role for partition (call it $X$) with a relationship $\{J_1, \ldots, J_n\} \rightarrow X \rightarrow \{S_1, \ldots, S_n\}$
which is partitioned horizontally into roles $\{X_1, \ldots, X_n\}$ such that $\{J_1, \ldots, J_n\} \rightarrow \{X_1, \ldots, X_n\} \rightarrow \{S_1, \ldots, S_n\}$.
It follows that $(\{J_1, \ldots, J_n\} \subseteq (X_1 \odot X_2 \odot \cdots \odot X_n)) \wedge (\{S_1, \ldots, S_n\} \subseteq (X_1 \oplus X_2 \oplus \cdots \oplus X_n))$.

Updates  to  the  role  graph  include  the  reduction  and  addition  of  role  privileges  which 
require  the  specification  of  the  target  role  and  privileges  to  be  removed/added,  but  do  not 
alter  the  basic  structure  and  relationships  in  the  role  graph  structure.  These  may  be  addressed 
within  the  context  of  role-privilege  authorization. 
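
As an illustration of the partition constraints, the following Python sketch performs a vertical partition and checks both the union condition and the disjointness of the new direct privilege sets. The function name, the argument layout (an ordered list of name and privilege-set pairs, junior-most first) and the sample roles are assumptions of this sketch.

```python
def vertical_partition(direct, edges, target, parts):
    """Replace `target` by a chain of new roles.

    `parts` is an ordered list of (role_name, direct_privilege_set)
    pairs, junior-most first. Edges are (junior, senior) pairs."""
    union = set()
    for _, privileges in parts:
        union |= set(privileges)
    if union != direct[target]:
        raise ValueError("partition must keep the direct privilege set invariant")
    if sum(len(set(p)) for _, p in parts) != len(union):
        raise ValueError("direct privilege sets of the new roles must be disjoint")
    names = [name for name, _ in parts]
    new_direct = {r: set(p) for r, p in direct.items() if r != target}
    for name, privileges in parts:
        new_direct[name] = set(privileges)
    new_edges = set()
    for junior, senior in edges:
        if senior == target:
            new_edges.add((junior, names[0]))    # join the junior end
        elif junior == target:
            new_edges.add((names[-1], senior))   # join the senior end
        else:
            new_edges.add((junior, senior))
    new_edges |= set(zip(names, names[1:]))      # chain x1 -> x2 -> ... -> xn
    return new_direct, new_edges


if __name__ == "__main__":
    direct = {"J": {1}, "X": {2, 3}, "S": {4}}
    edges = {("J", "X"), ("X", "S")}
    print(vertical_partition(direct, edges, "X", [("X1", {2}), ("X2", {3})]))
```

A horizontal partition would differ only in the linking step: every new role would be attached to all juniors and all seniors of the target, with no chain among the new roles.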

Algorithm 3 Vertical_Partition(rg, target, $\{((x_i, \rightarrow), \mathit{rpset}_i)\}$)

/* Partitions a given role vertically */

Input: rg = $(\mathcal{R}, \rightarrow)$ (the role organization structure), target (the target role to be partitioned),
    $\{((x_i, \rightarrow), \mathit{rpset}_i)\}$ (the new role-direct privilege set pairs and their ordering).
Output: The role graph with target vertically partitioned into $\{((x_i, \rightarrow), \mathit{rpset}_i)\}$ and integrated
    into the role graph structure.
Uses Superior_Set and Junior_Set of algorithm 2 in figure 10.

Var s_set, j_set: role_set; $r_s$, $r_j$: roles;
Begin
  If Direct(target) $\neq \bigcup$(Direct($x_i$))
  Then abort    /* Must keep privilege set invariant */
  Else Begin
    1. $\mathcal{R} := \mathcal{R} \cup \{x_i\}$;    /* Add new roles to system */
    2. s_set := Superior_Set(target);    /* Generate superior set */
    3. j_set := Junior_Set(target);    /* Generate Junior set */
    4. Add edges $x_1 \rightarrow x_2$, $x_2 \rightarrow x_3$, $\ldots$, $x_{n-1} \rightarrow x_n$;    /* Create a Path as specified */
    5. For all $x_i$ do Direct($x_i$) := $\mathit{rpset}_i$;    /* Assign the appropriate privileges */
    6. For all $r_s \in$ s_set do add $x_n \rightarrow r_s$;    /* Join the Senior end */
    7. For all $r_j \in$ j_set do add $r_j \rightarrow x_1$;    /* Join the Junior end */
    8. $\mathcal{R} := \mathcal{R} -$ target;    /* Delete target from system */
  end
end    /* Vertical_Partition */


Figure 12: Algorithm for Vertical Partition


Algorithm 4 Horizontal_Partition(rg, target, $\{(x_i, \mathit{rpset}_i)\}$)

/* Partitions a given role horizontally */

Input: rg = $(\mathcal{R}, \rightarrow)$ (the role organization structure), target (the target role to be partitioned), $\{(x_i, \mathit{rpset}_i)\}$
    (the new role-direct privilege pairs to replace target).
Output: The role structure with target horizontally partitioned into $\{(x_i)\}$ and integrated into rg.
Uses Superior_Set and Junior_Set of algorithm 2 in figure 10.

Var s_set, j_set: role_set; $r$, $r_i$, $r_j$, $r_s$: roles;
Begin
  If Direct(target) $\neq \bigcup$(Direct($x_i$))
  Then abort    /* Must keep privilege set invariant */
  Else begin
    1. $\mathcal{R} := \mathcal{R} \cup \{x_i\}$;    /* Add new roles to system Roles */
    2. s_set := Superior_Set(target);    /* Generate the superior set */
    3. j_set := Junior_Set(target);    /* Generate the junior set */
    4. For all $x_i \in \{x_1, \ldots, x_n\}$ do    /* Assign respective privilege sets */
         Direct($x_i$) := $\mathit{rpset}_i$;    /* assign the privilege set to the new role */
    5. $\mathcal{R} := \mathcal{R} -$ target;    /* Delete target */
    6. For all $x_i \in \{x_1, \ldots, x_n\}$ do
       begin For all $r_s \in$ s_set do add $x_i \rightarrow r_s$;    /* Link New Roles to seniors */
             For all $r_j \in$ j_set do add $r_j \rightarrow x_i$;    /* Link New Roles to juniors */
       end;
    7. For all $r_i, r_j \in \mathcal{R}$ if $\Psi(r_i) = \Psi(r_j)$ then    /* Remove any duplicate roles */
       Begin
         for all $r_i \rightarrow r$ do add the edge $r_j \rightarrow r$;
         for all $r \rightarrow r_i$ do add the edge $r \rightarrow r_j$;
         Delete all edges $r_i \rightarrow r$ and $r \rightarrow r_i$;
         remove $r_i$;
       end
  end
end    /* Horizontal_Partition */


Figure 13: Algorithm for Horizontal Partition

4.4 The Role Graph & Role Coupling

Considering  our  role  graph  model  proposed  in  section  4.2,  we  term  the  extent  of  linkage 
between  roles  a  coupling  which  is  related  to  the  extent  to  which  privileges  are  shared  among 
roles.  We  can  have  a  variety  of  cases,  e.g.  where  each  role  is  independent  of  all  others  or 
where  some  roles  are  coupled  and  hence  dependent  on  each  other. 

Definition 17 Coupling: Coupling exists between two roles $r_i$ and $r_j$ if $\exists r_k$ such that $r_k \in
r_i \odot r_j$ and $r_k \neq \text{MinRole}$. We call $r_k$ a coupling role between $r_i$ and $r_j$. □

Definition 18 Role Independence: Two roles $r_i$ and $r_j$ are independent if and only if $r_i \odot
r_j = \{\text{MinRole}\}$, i.e. their only coupling is the role common to all roles in the role graph. In
other words their only greatest lower bound is MinRole. □

Independent  roles  have  no  coupling  between  them. 
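
A possible way to test for coupling is sketched below in Python. The sketch assumes that the common juniors of two roles can be found by intersecting their lower sets, where a lower set contains the role itself and every role junior to it; this reflexive reading, the function names and the sample graph are assumptions made for illustration.

```python
def lower_set(role, edges):
    """The role itself plus every role from which it can be reached
    along (junior, senior) edges."""
    found, stack = {role}, [role]
    while stack:
        current = stack.pop()
        for junior, senior in edges:
            if senior == current and junior not in found:
                found.add(junior)
                stack.append(junior)
    return found


def coupling_roles(r_i, r_j, edges, min_role="MinRole"):
    """Common juniors of the two roles, excluding MinRole."""
    return (lower_set(r_i, edges) & lower_set(r_j, edges)) - {min_role}


def roles_independent(r_i, r_j, edges, min_role="MinRole"):
    """Two roles are independent when they have no coupling role."""
    return not coupling_roles(r_i, r_j, edges, min_role)


if __name__ == "__main__":
    graph = {
        ("MinRole", "A"), ("MinRole", "B"), ("MinRole", "C"),
        ("A", "E"), ("B", "E"), ("C", "F"),
        ("E", "H"), ("E", "I"), ("F", "I"),
    }
    print(sorted(coupling_roles("H", "I", graph)))
    print(roles_independent("H", "F", graph))
```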


5  Comparison  with  Hierarchies,  Privilege  Graphs  &  Others 


The role graph model presented here can simulate a hierarchical organization. We can convert
a role graph into a tree (hierarchy) and vice versa. To obtain a tree from a given role graph, we
designate MaxRole as the root of the hierarchy and perform a recursive breadth-first or depth-first
traversal of every node related to MaxRole. A given path terminates when MinRole is
encountered, so MinRole forms the leaf of every path in the resulting tree (hierarchy).
This tree contains all paths present in the associated role graph. In going from a tree to a
role graph, we designate the root of the tree to be MaxRole and perform a depth-first traversal of
the tree, equating nodes whenever equal privileges are encountered. The resulting role graph
can then be augmented with MinRole if necessary. The advantage of the role graph is
its compactness, i.e. shared nodes lower in the hierarchy need not be duplicated. This is a
major advantage in that it reduces the extent to which shared privileges are scattered among
roles, which makes the task of tracking their use easier.
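
The graph-to-tree direction and the compactness argument can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the graph is a mapping from each role to its direct juniors, every downward path ends in MinRole, and the role names A, B, C and the helpers `paths_to_min` and `tree_size` are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical role graph in which role C is shared by A and B.
GRAPH: Dict[str, Set[str]] = {
    "MaxRole": {"A", "B"},
    "A": {"C"},
    "B": {"C"},
    "C": {"MinRole"},
    "MinRole": set(),
}


def paths_to_min(graph: Dict[str, Set[str]], role: str,
                 min_role: str = "MinRole") -> List[List[str]]:
    """Depth-first enumeration of every path from `role` down to MinRole.

    Each path corresponds to one root-to-leaf branch of the unfolded tree.
    """
    if role == min_role:
        return [[role]]
    paths = []
    for junior in sorted(graph.get(role, set())):
        for tail in paths_to_min(graph, junior, min_role):
            paths.append([role] + tail)
    return paths


def tree_size(graph: Dict[str, Set[str]], role: str) -> int:
    """Number of nodes in the tree obtained by unfolding the graph below `role`."""
    return 1 + sum(tree_size(graph, junior) for junior in graph.get(role, set()))
```

Here the graph has five roles, whereas the unfolded tree has seven nodes because C and MinRole are duplicated under both A and B. That duplication is what the role graph avoids.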

To simulate privilege graphs [Bal90], attach to every role an associated functionality that
specifies the associated duty requirements, title, etc. With the role’s access control list (racl)
acting as the user/group node (figure 2), it is possible to determine the authorized users
for any role. An authorized user’s access rights are determined by the effective privilege set
P(r) of the associated role r to which the user is authorized. Further, remove MaxRole
and assign its explicit privileges to the roles with a direct partial privilege relationship with it.
Likewise, remove MinRole and assign its privileges to those roles with a direct partial privilege
relationship with it. The result is a privilege graph.

Finally,  although  this  model  is  based  on  subsets  with  an  acyclic  graph,  it  is  different 
from  the  Bell  and  LaPadula  Model  (BLPM).  Moreover,  although  both  are  meant  for  security 
application, they have different approaches to realizing protection. The BLPM relies on
subsets,  acyclicity  and  is  static.  However,  it  is  based  on  the  classification  of  information  as 
opposed  to  the  execution  of  operations  as  is  the  case  in  our  model.  The  BLPM  specifies  two 
simple  operations  of  either  read  or  write  access  depending  on  object  classification  and  subject 
clearance. This approach realizes multilevel security. In our model, privileges represent predefined
executions designed in a manner intended to realize certain desired functionality in
a  system.  These  operations  are  designed  from  considerations  of  desired  system  functionality. 
Once  defined,  the  operations  are  distributed  among  roles  in  the  system  in  the  manner  that 
suits  organizational  requirements.  The  executions  can  be  simple  reads  and  writes.  They  can 
be a combination of simple reads and writes. But they can also be complex executions such as
methods in object-oriented programming. These operations need not merely alter or return
the  information  relating  to  a  given  object  but  can  also  create  other  objects  and  invoke  other 
operations. 

In the BLPM, once classification has been done, access to information is governed by
the simple security property and the *-property. Its specification is static. In our model,
execution  of  privileges  can  cause  the  assignment  or  revocation  of  privileges  pertaining  to 
some  role.  In  that  respect,  our  model  is  dynamic. 

6  Summary  &  Conclusions 

It  is  important  to  have  a  means  for  role  organization  that  reduces  the  complexity  of  privilege 
management in a role-based security system. This paper has presented a model for role
organization  derived  from  three  basic  role  relationships,  viz:  partial,  shared  and  augmented 
privileges.  These  lead  to  a  role  graph  formulation  and  use  of  role  graph  theory.  The  model 
allows  for  the  assignment  of  privileges  in  a  particular  role  and  through  role  relationships,  we 
determine  the  extent  of  privilege  sharing.  Given  the  acyclicity  property,  the  role  graph  model 
facilitates  role  partial  ordering  and  privilege  subsetting  among  roles.  With  an  appropriate 
assignment  of  privileges  to  roles  and  specification  of  role  relationships,  the  role  graph  can 
ease the task of access rights administration in a system. Our model has the expressive
power  of  both  hierarchies  [TDH92]  and  privilege  graphs  [Bal90]. 

The issue of role administration was addressed and algorithms for role management presented.
These include algorithms for role addition, deletion and split (partition). Central to
role management is the concept of the change (or lack of change) of path privileges, because
path  privilege  changes  have  implications  for  roles  with  indirect  access  to  these  privileges. 

The concept of paths in the role graph is important in that specific types of processing
can  be  associated  with  specific  paths.  Since  there  is  privilege  sharing  among  roles  within 
a  path,  one  can  impose  constraints  about  the  order  of  role  participation  in  the  processing 
as  well  as  separation  of  duty  requirements.  Role  and  path  independence  are  important  for 
cases with conflict of interest. Two types of processing that conflict can be associated with
independent paths and, by ensuring that no user is authorized for roles from both paths, we
can impose conflict-of-interest restrictions on processing.

Currently,  we  are  involved  in  the  implementation  of  a  role  management  tool  which  we 
hope  will  give  further  insight  into  the  applications  of  the  role  graph  model  in  access  rights 
administration.
Abstract

When there are a large number and variety of users in a system, a large number of authorization
rules is required to define their access rights. Because of their number and variety, these
rules become too difficult and cumbersome to maintain, the authorization evaluation algorithms
are not efficient, and their storage takes up a large amount of memory. Also, it is hard for
security  administrators  to  understand  why  a  specific  user  is  given  a  set  of  rights.  The  solution 
is  to  have  groups  of  users  rather  than  individual  users  as  subjects  that  receive  access  rights 
from  the  authorization  system.  While  several  approaches  to  grouping  users  have  been  presented 
they  are  not  powerful  enough  to  describe  a  wide  range  of  logical  groupings.  In  this  paper  we 
develop  a  generalized  approach  to  user  group  structures  to  solve  these  problems.  We  present 
structurings  and  primitives  for  user  groups  based  on  object-oriented  concepts  which  are  more 
powerful  and  general  than  those  presented  until  now.  Although  formulated  in  the  context  of 
an  object-oriented  database  system,  our  approach  is  general  and  could  be  applied  to  other  data 
models,  and  even  to  operating  systems. 

Keywords: Access-matrix-based security, Authorization models, Data administration, Database
security, Discretionary access control, Security of computer systems.
1 Introduction


Computer  installations  keep  increasing  in  complexity:  the  lower  cost  of  equipment  and  the  tendency 
towards  decentralization  and  distribution  has  increased  the  number  of  users  that  have  some  type  of 
access to the computing system. In this scenario, the control of security becomes a more difficult
problem and security administrators need tools to support their work. A large number of authorization
rules  is  required  when  access  rights  are  defined  using  access-matrix-based  models.  These  rules  are 
difficult  to  maintain  and  store  and  it  is  hard  for  a  security  administrator  to  understand  the  security 
implication  of  each  rule. 

Groups  as  accessing  entities  instead  of  individual  users  allow  administrators  to  place  users  with 
similar  functions  in  the  same  group  and  authorization  rights  can  then  be  given  with  respect  to  the 
group.  This  improves  the  efficiency  of  the  system  because  the  number  of  authorization  rules  to 
be  handled  decreases  dramatically.  It  also  has  the  advantage  of  making  possible  the  application 
of  institutional  policies  in  the  definition  of  the  groups;  for  example,  all  secretaries  have  similar 
functions  and  can  be  given  a  package  of  common  rights.  Because  the  number  of  distinct  functions 
in  a  typical  institution  is  not  very  large  the  total  number  of  groups  is  reasonably  low.  Most 
authorization  models  include  groups  in  one  way  or  another.  Groups  are  another  application  of 
the  concept  of  implied  authorization  [9],  where  an  authorization  rule  defined  with  respect  to  some 
composite  structure  (data  or  users)  gives  similar  rights  for/to  each  component. 

We develop here a unified framework for groups. We first consider unstructured groups and
formulate an authorization model based on object-oriented concepts. We also define a set of procedures
to create, destroy, and manipulate user groups. We then extend this analysis to groups of
groups and present an additional set of procedures. In particular, we propose three group structurings
in analogy with the possible associations used in object-oriented modeling. We also show how
to use these groups in evaluating access requests and to implement different security policies. Expressing
the authorization system as a set of classes and associations in an object-oriented database
allows the system to protect the authorization rules in the same way as the rest of the data in the
database. If the database is of another type, our approach is still useful as a design guideline.

Many  authorization  systems  have  used  the  concept  of  groups,  e.g.,  [1],  [2],  [3],  [5],  [11],  [12], 
[13],  [17],  [21];  our  approach  is  more  general  in  that  all  of  these  models  can  be  shown  to  be  special 
cases  of  ours.  Our  approach  thus  provides  a  unifying  framework  for  all  those  models.  Some  of  the 
most  interesting  earlier  approaches  are: 

•  The Inventory Control System (ICS), developed at IBM [12]. In their approach a group is an
entity with which users and/or groups may be associated to access resources. They defined
a set of special access types to handle groups, including run (use a resource in a group), use
(use resources and create files in the group), control (implies create and also allows a group
member to connect other users to the group), and join (implies control and permits a user
to define new system users and new subgroups). Four data structures, group, user, connect,
and file, define the data needed to describe the group structure.

•  SQL/Oracle.  R.  Baldwin  proposed  the  concept  of  Named  Protection  Domains  (NPD’s), 
and  applied  it  to  manage  security  in  SQL  relational  databases  [2].  An  NPD  is  a  collection 
of  application-oriented  rights  given  to  a  set  of  users  with  similar  functions.  NPD’s  can  be 
structured  into  trees  of  rights  (the  higher-level  NPD’s  include  the  rights  of  the  lower-level 
NPD’s). 

•  The  IRIS  database  system.  This  is  an  object-oriented  database  developed  at  Hewlett  Packard 
[1].  Users  are  described  by  class  user  which  includes  operations  to  create  and  destroy  user 
objects,  return  valid  user  names,  and  match  passwords.  A  class  group  describes  sets  of  users 
to which rights can be granted. Groups can be structured into trees similarly to the NPD’s
described  above.  There  are  also  special  operations  for  groups,  e.g.  to  list  the  members  of  a 
group. 

•  Kelter did a systematic study of group structure in [13] and [14]. In addition to the concept
of tree-structured groups where supergroups include the rights of subgroups, he introduced
the concept of specialized subgroups that inherit the rights from their supergroups.

•  Bertino and Weigand [3] also consider inheritance from supergroups into their subgroups.
They also consider the effect of inheriting content-dependent predicates (which we do not
consider in this paper).

•  Brüggemann [5] defines inheritance in subject groups and also considers the effect of negative
authorization rules.


Some authorization systems for object-oriented databases also use data grouping to reduce
the number of authorization rules. For example, [5] and [15] use the granularity of the data as a
criterion for implied access: access to a class implies access to all the instances of that class, etc.
Other approaches, e.g. [7], take advantage of the semantic data structuring: access to a class implies
similar access to its subclasses. These groupings are orthogonal to the user groupings considered
here and could be combined with them for further reduction in the number of access rules. However,
user groupings do not require any data grouping to be effective and can be used with any data
model.

Several authors have proposed the concept of user roles, where a role is a group of users that
has specific functions [2], [21]. We can consider roles as defining policies and groups as mechanisms
to apply these policies. Here we concentrate on mechanisms for group structuring regardless of
their use as specific roles. In fact, our study is complementary to such studies.

Section  2  reviews  basic  concepts  of  object-oriented  systems  and  of  the  authorization  model. 
Section  3  discusses  several  issues  related  to  groups.  In  Section  4  we  propose  policies  and  procedures 
to  handle  unstructured  groups.  Section  5  considers  structures  consisting  of  groups  of  groups. 
Section  6  considers  access  request  validation  while  Section  7  provides  a  complete  example  of  the 
application  of  these  structures.  Section  8  provides  conclusions  and  directions  for  future  work. 

2  Background 

2.1  Object-oriented  concepts 

There  is  a  variety  of  models  that  use  object-oriented  concepts  to  specify  a  system  design.  They  can 
be  used  as  a  semi-formal  model  that  is  relatively  precise,  that  can  be  easily  formalized,  and  can  be 
used as a basis for implementation. To be concrete we adopt one of these here, the Object Modeling
Technique (OMT) [19], which is used throughout this paper. OMT is a methodology which
consists  of  three  submodels:  The  object  model  represents  the  static,  structural,  “data”  aspects  of  a 
system.  The  dynamic  model  represents  the  temporal,  behavioral,  “control”  aspects  of  a  system.  The 
functional  model  represents  the  transformational,  “function”  aspects  of  a  system.  A  typical  software 
application  incorporates  all  three  aspects:  It  uses  data  structures  (object  model),  it  sequences 
operations  in  time  (dynamic  model),  and  it  transforms  values  (functional  model).  Each  submodel 
contains  references  to  entities  in  the  other  submodels. 

The object model describes the structure of objects in a system: their identity, their relationships
to other objects, their attributes, and their operations. Objects with the same data structure
(attributes) and behavior (operations) are grouped into a class. A class is an abstraction that
describes common properties of some entity of interest to an application. Each class describes a
possibly infinite set of individual objects. Each object is said to be an instance of its class. Each
instance of the class has its own value for each attribute but shares the attribute names and operations
with other instances of the class. An attribute of an object may take on a single value or a set
of values [16]. An object may be an instance of only one class. An operation (method) is an action
or transformation that an object performs or to which it is subject. Attributes and operations are
referred to collectively as features. Some features, denoted by $, are called class features and apply to
all the objects of a class.

Classes with common properties can be generalized into superclasses which factor out these
properties (generalization association). Conversely, subclasses can be said to be particularizations
of their superclasses. Generalization is denoted in this model by a small triangle (△).

Inheritance is the property of subclasses where they share attributes and operations based on
their hierarchical relationship. A class can be refined into successively more detailed subclasses.
Each subclass incorporates, or inherits, all of the features of its superclass and adds its own unique
features. A class may have any number of subclasses. A class may have any number of superclasses,
and inherits attributes and operations from all of them; this is called multiple inheritance.

Aggregation is a form of relationship in which an aggregate object is made of components. The
aggregate is semantically an extended object that is treated as a unit in many operations, although
physically it is made of several parts. In diagrams this concept is represented by a rhomboid or
diamond (◇).

A  relationship  association  represents  the  fact  that  instances  in  different  classes  participate  in 
common activities. For example, a student taking a course can be described in this way. Multiplicity
specifies  how  many  instances  of  one  class  may  relate  to  a  single  instance  of  a  related  class  and 
constrains  the  number  of  related  objects.  Multiplicity  is  often  described  as  being  “one”  or  “many”, 
but  more  generally  it  is  a  (possibly  infinite)  subset  of  the  non-negative  integers.  In  diagrams  a 
relationship  is  represented  by  a  link  at  whose  ends  the  corresponding  multiplicities  are  defined  (a 
black  dot  (•)  indicates  “many”.) 

2.2  Security  policies  and  authorization  rules 


In general, an authorization rule is a tuple (s, o, a, p, f), which defines that subject s has authorization
of type a (access type) to those data values of security object o for which predicate p is true.
Subject s can grant the access right (o, a) to other subjects if the copy flag f is true. Because these
rules  can  represent  most  security  policies  this  model  has  been  used  as  a  basis  to  describe  many  of 
the  authorization  systems  for  relational  databases  [9]  and  object-oriented  databases  [7]. 

If  we  are  not  concerned  with  the  control  of  individual  objects  but  with  control  at  the  class  or 
attribute  level  we  do  not  need  to  consider  predicates.  For  those  cases  an  authorization  rule  is  just 
a triple (s, o, a) where s is a subject, a is an access type, and o is a class or a set of attributes. If
we  do  not  allow  subjects  to  grant  rights  to  other  subjects  we  do  not  need  a  copy  flag  either.  For 
space  reasons  we  will  not  discuss  here  granting  of  rights. 
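
As an illustration of the two rule forms, the following Python sketch models the general tuple (s, o, a, p, f) and lets the class-level triple (s, o, a) arise by leaving the predicate and copy flag at their defaults. The names `Rule`, `permits`, `may_grant`, and `always_true`, and the representation of a data value as a dictionary, are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping


def always_true(record: Mapping[str, Any]) -> bool:
    """Default predicate: no data-dependent restriction."""
    return True


@dataclass(frozen=True)
class Rule:
    subject: str                      # s
    obj: str                          # o: a class or a set of attributes
    access: str                       # a: the access type
    predicate: Callable[[Mapping[str, Any]], bool] = always_true  # p
    copy_flag: bool = False           # f: may the subject grant (o, a) onward?


def permits(rule: Rule, subject: str, obj: str, access: str,
            record: Mapping[str, Any]) -> bool:
    """True if the rule authorizes the request for this particular data value."""
    matches = (rule.subject, rule.obj, rule.access) == (subject, obj, access)
    return matches and rule.predicate(record)


def may_grant(rule: Rule, subject: str, obj: str, access: str) -> bool:
    """True if the rule lets the subject pass the right (o, a) to others."""
    matches = (rule.subject, rule.obj, rule.access) == (subject, obj, access)
    return matches and rule.copy_flag
```

A rule built as `Rule("bob", "Employee", "read")` behaves like the triple (s, o, a): it applies to every instance of the class and carries no grant option.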

As  said  earlier,  in  systems  where  subjects  may  be  groups  of  users  a  right  given  to  a  group  may 
be  implied  as  a  right  for  all  the  users  in  the  group.  We  assume  here  that  all  the  rights  granted  to 
a  group  are  implicitly  available  to  each  group  member. 

Negative authorization rules are necessary to override implied access rights and are very useful
to  specify  precisely  the  required  authorization  of  some  objects.  For  example,  the  system  described 
in [18] uses positive and negative authorizations: a subject may be denied access to an object either
because  it  has  no  authorization  for  it  or  because  it  has  a  negative  authorization  on  it.  Negative 
authorization  constraints  are  also  required  by  the  Orange  book  (that  defines  standards  for  the 
security  of  commercial  systems)  for  security  classes  B3  and  A1  [6].  Again  for  space  reasons  we  will 
leave  out  the  analysis  of  negative  rights. 

An  important  policy  decision  is  the  separation  of  ownership  from  administration.  In  the  first  case 
users own and administer their data; in the second case the information belongs to the enterprise,
users  are  given  access  to  it  to  perform  their  functions  and  special  users  (administrators)  control  the 
structure  and  the  use  of  the  information.  Again,  for  conciseness  we  do  not  discuss  administrative 
aspects,  although  clearly  the  user  grouping  operations  would  be  used  by  administrators. 


3  User  Groups 


As  said  earlier,  user  groups  can  be  used  for  efficiency  and  as  a  way  to  define  sets  of  rights  based 
on the organization of an institution [9], [21]. The authorization rules now take the basic form
(g, o, a, p), where g is a user group (remember that we left out the copy flag). However, these
groupings  bring  the  problem  of  how  to  interpret  a  given  access  request  (since  there  is  no  direct 
mapping  now  from  the  components  of  the  request  to  the  components  of  the  authorization  rules). 
Users  may  belong  to  more  than  one  group  (although,  in  general,  they  will  belong  to  only  a  few 
groups).  Two  or  more  groups  may  have  access  to  the  same  objects  (as  well  as  other  objects),  i.e., 
they  may  share  rights.  In  this  case  more  than  one  access  rule  may  be  applicable  to  a  given  access 
request. For example, given a request from user u, who belongs to user groups g1 and g2, for access
of type a to object o, there may be two rules that apply:

r1 : (g1, o, a, p1) and r2 : (g2, o, a, p2)

Two  basic  policies  to  handle  this  case  are  [9]: 

1.  The  user  chooses.  Some  systems  require  a  user  to  specify  which  group  applies  for  a  particular 
session.  For  example,  a  Multics  user  who  works  on  several  projects  chooses  one  to  provide 
the  authorization  context  for  a  session. 

2.  The  rules  are  combined.  A  user  can  receive  the  union  of  the  rights  of  the  groups  to  which  he 
belongs.  This  allows,  for  example,  an  overall  minimum  level  of  rights  to  be  specified  for  the 
Universal  or  Public  user  group  (This  group  allows  access  to  a  basic  set  of  objects).  In  our 
example, the request is valid if it is authorized by either r1 or r2.

Clearly, intersecting the rules of the groups to which a user belongs is not reasonable; for
example  if  she  belongs  to  two  disjoint  groups  she  would  get  no  access  rights! 
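
The difference between the two policies, and the problem with intersection, can be seen in a small Python sketch. It is a simplified illustration rather than the paper's mechanism: predicates are omitted, rights are (object, access type) pairs, and the group names, user name, and function names are hypothetical.

```python
from typing import Dict, Set, Tuple

Right = Tuple[str, str]  # (object, access type); predicates are left out

# Hypothetical data: user "u" belongs to two groups with disjoint rights.
GROUP_RIGHTS: Dict[str, Set[Right]] = {
    "payroll": {("Salary", "read"), ("Salary", "update")},
    "reception": {("Visitor", "read")},
}
MEMBERSHIP: Dict[str, Set[str]] = {"u": {"payroll", "reception"}}


def allowed_user_chooses(user: str, active_group: str, obj: str, access: str) -> bool:
    """Policy 1: only the group selected for the session provides rights."""
    if active_group not in MEMBERSHIP.get(user, set()):
        return False
    return (obj, access) in GROUP_RIGHTS.get(active_group, set())


def allowed_union(user: str, obj: str, access: str) -> bool:
    """Policy 2: the request is valid if any of the user's groups authorizes it."""
    return any((obj, access) in GROUP_RIGHTS.get(group, set())
               for group in MEMBERSHIP.get(user, set()))


def rights_intersection(user: str) -> Set[Right]:
    """The rejected alternative: rights common to all of the user's groups."""
    right_sets = [GROUP_RIGHTS.get(group, set()) for group in MEMBERSHIP.get(user, set())]
    return set.intersection(*right_sets) if right_sets else set()
```

With this data, the union policy lets u read Salary in any session, the user-chooses policy allows it only while payroll is the active group, and the intersection of the two disjoint groups is empty.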
The two possibilities above may make sense in specific environments. For example, [11] makes
a case for the use of the union of groups while [14] believes only one group should be active in a
user interaction. Any mechanism for groups should be able to implement both of them.

Other  issues  that  must  be  considered  to  define  group  structurings  are: 

•  How  are  access  rights  associated?  Are  they  associated  only  with  groups,  or  could  users  also 
have  individual  rights?  There  is  a  tradeoff  between  complexity  and  flexibility  in  these  two 
approaches.  The  simplest,  and  more  uniform  approach,  is  to  associate  rights  only  with  groups 
(we  can  always  define  a  group  with  only  one  user  to  accommodate  special  users)  and  we  use 
this  approach  here. 

•  How  to  structure  groups,  i.e.,  is  it  possible  to  have  groups  of  groups?  As  we  discuss  later 
(Section 5) this can significantly enhance the power of the model to describe different security
policies. 

•  How are the rights for groups defined? This is an institutional policy; ideally, groups should
correspond to user functions or roles and the group should receive the necessary rights for the
required function to be performed (e.g., a mail clerk receives only enough rights to perform
his job).

•  How to evaluate access using groups, i.e., how to accelerate access request evaluation by
taking advantage of group composition. We discuss this in Section 6.

•  How  to  revoke  granted  rights;  this  is  a  more  general  problem  that  has  been  studied  elsewhere 
[9]. It depends on the general problem of how rights are granted.

•  How  to  find  efficient  implementation  methods?  While  of  practical  importance,  this  is  not 
discussed  in  this  paper.  Sandhu  [20]  considered  this  aspect  in  detail. 


4  Policies  for  Unstructured  Groups 

We  start  our  discussion  with  a  system  that  uses  only  independent  groups.  We  call  these  unstructured 
groups.  We  then  consider  groups  that  are  related  to  one  another,  and  we  call  this  kind  of  groups 
structured groups. In this section we discuss the first approach and develop a conceptual
specification of its behavior; groups of groups are considered in Section 5.

Figure 1: Class model for authorization rules

We define a group as a set of users with common rights. In other words, access rights are
defined for specific user groups, or inversely, a group can be seen as a set of rights. When we create
a group, we are effectively creating a set of rights, i.e., usually groups are defined with respect to
some functional task (role) that has to be performed and that requires access to a set of specific
data items. This is a strict application of the need-to-know policy: a set of rights for each functional
task. Formally, a group g is defined as g = {ui}, where ui is any registered user of the system,
right(ui) is the set of rights of user ui, and right(g) is the set of rights of group g. right(ui) and
right(g) are related as follows:

(g, o, a) ∈ right(g) ∧ ui ∈ g → (ui, o, a) ∈ right(ui)
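
Read as a derivation rule, the implication above generates every user right from the group rights and the membership relation. A minimal Python sketch, assuming rights are stored as (subject, object, access type) triples and membership as a mapping from group to users (the function name `derive_user_rights` and the sample data are hypothetical):

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, object, access type)


def derive_user_rights(group_members: Dict[str, Set[str]],
                       group_rights: Set[Triple]) -> Set[Triple]:
    """Apply the implication once for each group right and each member of that group."""
    return {(user, obj, access)
            for (group, obj, access) in group_rights
            for user in group_members.get(group, set())}


# Hypothetical sample data.
MEMBERS = {"clerks": {"ann", "bo"}, "auditors": {"bo"}}
RIGHTS = {("clerks", "Invoice", "read"), ("auditors", "Ledger", "read")}
USER_RIGHTS = derive_user_rights(MEMBERS, RIGHTS)
```

Because the user rights are fully determined by the group data, they need not be stored; they can be recomputed whenever a membership or a group right changes.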

We  do  not  define  a  universal  or  public  group,  to  which  each  user  belongs  by  default  and  which 
provides  some  basic  access  rights,  but  require  that  any  right  be  explicitly  given.  A  user  may  belong 
to  one  or  more  groups  according  to  his  functional  tasks  (job  assignment).  All  rights  are  associated 
with  groups,  i.e.  users  only  acquire  rights  by  belonging  to  some  group.  A  new  group  can  be 
created  for  a  single  user  if  there  is  no  group  that  accommodates  her  functions.  If  we  also  apply 
the  policy  of  separation  of  use  from  administration,  groups  can  be  operational  groups,  database 
administrator  groups  and  security  administrator  groups.  However,  as  said  earlier,  that  distinction 
is  not  fundamental  for  our  group  development  and  we  do  not  pursue  it  further  here  (see  [7]  for 
further  discussion). 

As  shown  in  Figure  1,  authorization  rules  can  be  represented  using  OMT  as  a  relationship 
between  group  and  data.  The  “data”  class  shows  the  possible  structurings  of  the  data  in  an 
object-oriented database. Class User represents all the registered users with id u. Two basic
operations are shown in this class: Register_user and Join_group, with obvious meaning. Other
useful operations would be Delete_user, Leave_group, $Display_users, etc. [10]. The relationship
Member_of describes which users are in what groups. Class Group represents the groups in the
system, for which relationship Authorization_rule defines the corresponding rights. The relationship
attribute Method/Access type defines the operator that the user is authorized to apply to the data,
while Predicate indicates the corresponding data-dependent restriction. In some systems access is
controlled at the read/write record level [7], [15]; in others, at the method level [8]. The method
Check_rights evaluates whether a given request is authorized for some subject. Check_rights could also be
attached to data if we think that its invocation is the result of accessing some specific data entity.
Method Access_type returns the method authorized to a user for a given class. As indicated earlier,
the hierarchical structure of classes and subclasses may also be used to define implied accesses [7],
thus further reducing the number of explicit rules.

An  example  of  a  class  definition  is  given  below.  (Remember  $  denotes  a  class  operator  as  opposed 
to  an  object  operator.)  Some  of  these  methods  may  not  be  needed  in  specific  implementations  or 
additional  methods  may  be  needed;  that  is,  this  is  only  a  typical  definition. 


Class GROUP Is

    g: string;  -- group identifier

    proc Create_group (g);  -- Creates a new group g.
    proc Delete_group (g);  -- Deletes an existing group g.
    proc Divide_group (g, g(1), ..., g(n));  -- Divides one group into n groups.
    proc Combine_group (newg, g(1), ..., g(n));  -- Combines n groups into one group.
    proc Add_right (g, o, a);  -- Adds access right (o, a) to group g.
    proc Delete_right (g, o, a);  -- Removes access right (o, a) from group g.
    proc Display_G_attributes (g);  -- Lists the attributes of a group.
    proc Check_rights (g, o, a);  -- Checks if a group is authorized to apply a to data o.
    proc Access_type (g, o, a);  -- Returns the value of a for a given group with respect to data o.
    proc Add_member (g, u);  -- Adds user u to group g.
    proc List_rights (g);  -- Lists all the rights of a given group.
    proc $Display_groups;  -- Lists all the groups in the system.
    func $Is_a_group (g): Bool;  -- Checks if a specific group exists in the system.
    func $Has_a_right (g, o, a): Bool;  -- Checks if a group has a specific explicit access right.
    proc $Display_member;  -- Displays all the (user, group) pairs.

end


Specifications  for  the  operations  of  these  classes  are  given  in  detail  in  [10].  We  show  here  two  cases  as 
illustration. 

proc Join_group (u, g);
-- Adds a user to a group.
begin
    if not Is_a_group (g)
        then return error -- group does not exist
    else if User.Is_a_member (u, g)
        then return error -- user u is already a member of g
    else Add_member (g, u)
    end if
end Join_group


proc Add_right (g, o, a);
-- An access right tuple (o, a) is added to a group rights package.
begin
    if not Is_a_group (g)
        then return error -- group does not exist
    else if Has_a_right (g, o, a)
        then return error -- group already has this right
    else Authorization_rule := Authorization_rule ∪ {(g, o, a)}
    end if
end Add_right
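
The two specifications above can be mirrored by a small in-memory sketch. It is only an illustration: the class name GroupRegistry, the exception type, and the use of Python sets for Member_of and Authorization_rule are assumptions made here, not part of the specification in [10].

```python
class AuthorizationError(Exception):
    """Raised when a group operation violates one of its preconditions."""


class GroupRegistry:
    """Hypothetical in-memory stand-in for classes GROUP and USER."""

    def __init__(self):
        self.groups = set()    # existing group identifiers
        self.members = set()   # (user, group) pairs, i.e. Member_of
        self.rules = set()     # (group, object, access) triples, i.e. Authorization_rule

    def create_group(self, g):
        self.groups.add(g)

    def join_group(self, u, g):
        # Mirrors Join_group: both error cases are checked before adding the member.
        if g not in self.groups:
            raise AuthorizationError(f"group {g!r} does not exist")
        if (u, g) in self.members:
            raise AuthorizationError(f"user {u!r} is already a member of {g!r}")
        self.members.add((u, g))

    def add_right(self, g, o, a):
        # Mirrors Add_right: the rule set grows by the triple (g, o, a).
        if g not in self.groups:
            raise AuthorizationError(f"group {g!r} does not exist")
        if (g, o, a) in self.rules:
            raise AuthorizationError(f"group {g!r} already has right ({o!r}, {a!r})")
        self.rules.add((g, o, a))


registry = GroupRegistry()
registry.create_group("Programmer")
registry.join_group("M. Jones", "Programmer")
registry.add_right("Programmer", "compiler", "execute")
```

Raising an exception is one possible reading of "return error"; a status code would serve equally well.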


5  Policies  for  Structured  Groups 

5.1  Group  structuring 

In  an  institution  there  exist  more  complex  structures  than  just  groups.  For  example,  a  set  of 
departments  can  become  a  division  and  be  under  the  direction  of  a  single  manager.  Sometimes,  for 
work  reasons  it  is  necessary  to  collect  small  groups  of  employees  to  work  on  specific  projects.  This 
creates  the  need  to  have  groups  of  groups.  We  present  now  three  types  of  structurings  and  some 
of  the  corresponding  procedures  to  manipulate  them. 

As  seen  in  Section  2  there  are  three  basic  ways  to  associate  classes  in  the  object-oriented  model 
[19],  namely  generalization,  aggregation,  and  relationship.  By  analogy  these  three  associations  can 
be  used  as  a  basis  to  structure  groups  of  users  and  all  of  them  can  be  given  a  useful  meaning  for 
representing  the  institution  organization: 

•  A generalization structure describes structures where the subgroups perform more specialized
operations than the supergroups. For example, Figure 2 shows a group of programmers
(Programmer) which can be specialized to programmers that perform more specific tasks,
e.g., System Programmer, Real-time Programmer, etc.

•  A composition structure describes structures that represent the administrative or physical
division of people. For example, Figure 3 shows a company divided into three groups of
employees that belong to different departments.

•  A relationship structure describes associations between groups needed to perform new jobs.
It effectively describes a new group formed by taking people from two (or more) existing groups.
For example, in Figure 4 some real-time programmers and some processor designers are
assigned to work in a project called Real-Time Computer, defined by relationship Real-time
Computer. In forming this type of group we ignore the correspondence between set elements
indicated by a relationship, i.e., we only consider the presence of the elements themselves.
We also describe this group by the name of the relationship; a more consistent representation
from a notational viewpoint would describe this relationship as another class [10], but we
have not done this.

Figure 2: Generalization structure

A specific user may belong to different groups; for example, an individual could be in the group
of application programmers and in the group Manufacturing. In general, these groupings can
represent the permanent jobs of the users as well as temporary assignments. Sometimes both
types of groups may coincide; for example, a company may divide its departments according to
specialty, in which case the example of Figure 2 would also be a composition structure.
As another example, Figure 5 shows a group New_systems which is made up of three groups:
New_op_sys, New_user_if (interface), and New_real_time_comp, where each component group is a
relationship group combining designers of different specialties. Here New_systems would be the
group including all designers working in new systems development.

Figure 4: Relationship structure

The principle of “need to know” is fundamental to the design of a secure system [9]. This principle
establishes that users should be given just enough rights to perform their duties. Groups can reflect
the structure of the functional tasks to be performed and the policies of the institution. If they are
the only way to acquire rights, by controlling access to groups it is possible to enforce institution
security policies in a simple way. Consequently we can define the following group policies (GP):

•  Policy GP1: Users in subgroups of generalization structures inherit all the rights from their
supergroups.

•  Policy GP2: Users in supergroups of composition structures acquire all the rights of their
subgroups.

•  Policy GP3: Users in relationship structures bring to the new group their own group rights
and acquire the specific rights implied by the needs of the new group.

We use the following notation: \(\mathit{right}_e\) is a set of explicit rights, i.e., rights that are explicitly
associated with the group under consideration; \(\mathit{right}_i\) is a set of implicit rights, i.e., rights that can
be derived from the explicit rights using one of the above three policies. \(G = \{g_1, g_2, \ldots, g_k\}\) is a
generalization of k groups, \(C = \{g_1, g_2, \ldots, g_k\}\) is a composition of k groups, and \(R = \{g_1, g_2, \ldots, g_k\}\) is a
relationship group among k groups. A group g may have explicit or implicit rights, i.e.:

\[ \mathit{right}(g) = \mathit{right}_e(g) \cup \mathit{right}_i(g) \]

We express these policies formally as:

\[ GP_1: \; g_i \in G \wedge r \in \mathit{right}(G) \rightarrow r \in \mathit{right}_i(g_i) \]

\[ GP_2: \; g_i \in C \wedge r \in \mathit{right}(g_i) \rightarrow r \in \mathit{right}_i(C) \]

\[ GP_3: \; u_i \in R \wedge r \in \mathit{right}(R) \rightarrow r \in \mathit{right}_i(u_i) \]
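
To make the three implications concrete, the following sketch derives implicit rights by repeatedly applying them until nothing changes. It is an illustration under stated assumptions: the function name, the dictionary layout, and the sample groups and rights are invented here, and only the formal part of GP3 (members acquire the rights of the relationship group) is modeled.

```python
def derive_implicit_rights(explicit, generalizations, compositions, relationships):
    """Return {subject: implicit rights} derived with policies GP1, GP2 and GP3.

    explicit:        {subject: set of (object, access) rights}
    generalizations: {supergroup: [subgroups]}         (GP1: rights flow down)
    compositions:    {whole group: [component groups]} (GP2: rights flow up)
    relationships:   {relationship group: [members]}   (GP3: rights flow to members)
    """
    # Each edge (source, target) means: every right of source is implied for target.
    edges = []
    for supergroup, subgroups in generalizations.items():
        edges += [(supergroup, sub) for sub in subgroups]
    for whole, parts in compositions.items():
        edges += [(part, whole) for part in parts]
    for rel_group, users in relationships.items():
        edges += [(rel_group, user) for user in users]

    implicit = {}

    def right(subject):
        # right(g) is the union of explicit and implicit rights of g.
        return explicit.get(subject, set()) | implicit.get(subject, set())

    changed = True
    while changed:  # fixed point: the right sets only grow and are finite
        changed = False
        for source, target in edges:
            missing = right(source) - right(target)
            if missing:
                implicit.setdefault(target, set()).update(missing)
                changed = True
    return implicit


explicit = {
    "Programmer": {("compiler", "execute")},
    "Real_time_Programmer": {("rt_simulator", "execute")},
    "Dept_A": {("dept_a_programs", "read")},
    "New_real_t_comp": {("rt_spec", "write")},
}
implicit = derive_implicit_rights(
    explicit,
    generalizations={"Programmer": ["Real_time_Programmer"]},
    compositions={"Company": ["Dept_A"]},
    relationships={"New_real_t_comp": ["E. Lu"]},
)
```

In this example Real_time_Programmer gains the compiler right from Programmer (GP1), Company gains the right of Dept_A (GP2), and E. Lu gains the project right of New_real_t_comp (GP3).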

Note  that  a  relationship  group  is  not  strictly  a  group  of  groups  as  the  generalization  and  composition 
groups.  We  can  visualize  a  relationship  group  as  a  group  formed  taking  specific  users  from  two  or 
more  existing  groups  and  collecting  them  into  a  new  group. 

Policy GP1 is justified because subgroups define more specialized groups. In this case the members
of the subgroups should have the rights of their supergroups and in addition the rights needed
to perform more specialized functions. For example, a Real-time Programmer is a programmer and
should possess all the rights needed by programmers to perform their functions. Additionally, a
Real-time Programmer needs access to specialized tools.

Policy GP2 makes sense to describe data access. For example, the group that develops programs
for the whole company should have access to the programs developed by the individual departments.
Another justification comes from seeing the subgroups as the result of dividing an existing group;
in this case the large group is a way to refer to the set of subgroups and includes all of their
functions and consequently all of their rights. This policy describes the hierarchical structure of
most institutions where a high-level group has more power and rights than a lower-level group.

Figure 6: Group of groups graphs.

Finally, Policy GP3 would be useful to describe projects which require the combination of
people with different specialties or from diverse departments. Because their function is specific, the
members of this type of group should have project-specific rights in addition to their other group
rights based on their specialty or the department to which they belong.

As we said earlier, groups are very important for efficiency because there is no need to write
individual authorization rules defining the rights of each user; if a user can join a given group
because of her functions she automatically acquires all the group rights. Groups of groups allow us
to reduce the number of explicit rules even further by taking advantage of the inherent hierarchies
present in most institutions. Negative authorization rules and predicates can be used to provide
a more precise control of access (although we do not show that in this paper). Section 7 shows a
detailed example of the application of these policies.

5.2  Procedures 


We now add a set of procedures to manipulate groups of groups. All these procedures assume that
the structure of a group of groups is described by a class GROUP_OF_GROUPS which includes the
interconnection graph of the groups. We also assume that each group G in GROUP_OF_GROUPS
can include only one type of association, i.e., G can be a generalization, aggregation, or relationship
group of groups. This is not a restriction since we can decompose a complex group into homogeneous
subgroups. Notice also that a given group g can belong to any number of groups G of any type.
Figure 6 shows this situation; here GO represents an aggregation graph and so on. In particular,
generalization and aggregation graphs are trees; relationship class graphs can be seen as 2-level
trees if the relationships are binary or ternary. The class representation of group of groups could
be as shown below.

class GROUP_OF_GROUPS is

G: string; -- Identifier for the group-of-groups graph.
Children (g ∈ G); -- For each group g in the group of groups, this data structure describes its children.
Type: {generalization, aggregation, relationship}; -- The three types of group graphs.

proc Create_group_of_groups (G, t); -- Creates an empty group of groups G with type t.
proc Delete_group (G, g); -- Deletes group g from G.
proc Add_group (G, g1, g2); -- Adds group g1 as the child of g2 in G; the type of G determines the type of association.
proc Ancestor (G, g); -- Finds the ancestors of a group g within a group G.
proc Descendant (G, g); -- Finds the descendants of a group g within a group G.
proc Related_group (g); -- Finds the related groups of a group g in the system.

end


As  before,  the  details  of  these  procedures  can  be  found  in  [10]. 


6  Evaluating  Authorization 


Access requests from user programs or query languages must be compared with the authorization
rules to decide if the requested access is legal. This is normally performed by some evaluation
algorithm [9]. A request has the general form (s', o', a'), where s' is the requesting subject, o' is
the requested data item, and a' the intended access type. Since authorization rules have groups as
subjects, if we start from a request from user u, we must first determine the effective subject, G_eff,
considering the groups to which the user might belong. The data item can be a class or an attribute of
a class. The access type may be a method or read/write actions on attributes. In other words, we
are trying to match (g ∈ G_eff, o', a') against an authorization rule of the form (g, o, a), explicit
or implicit.

The evaluation algorithm is just one of the methods (Check_rights) in the authorization model
of Figure 1. To find the effective subject we need to determine all the inherited and acquired rights
according to the group structuring (Figure 7). A high-level expression of the evaluation of a request
is shown below. The algorithm accommodates any policy with respect to active groups by adjusting
the function Group_membership to return either only one group or a set of groups:

func Check_rights (s', o', a'): Bool;
-- s' comes from login; o' and a' are defined from the application language interface.
-- G_eff is the effective group of s', which defines the subject of the authorization rule.
begin
    G := USER.Group_membership (s');
    -- If we use the policy of OR-ing all the groups to which a user belongs, method Group_membership
    -- returns the set of all the groups to which a user belongs. If the user can only use one of his groups, the
    -- user would be allowed to select one of the groups in this set. This would imply adding here an
    -- interactive step to receive the user's choice.
    if G ≠ ∅
    then begin
        G_anc := ∪ (g ∈ G) Ancestor (g);
        -- This finds all the ancestors of g in all generalization group graphs to which g belongs.
        G_desc := ∪ (g ∈ G) Descendant (g);
        -- Finds all the descendants of g in all aggregation group graphs to which g belongs.
        G_rel := ∪ (g ∈ G) Related_group (G, g);
        -- This finds all groups directly related to g.
        G_eff := G ∪ G_anc ∪ G_desc ∪ G_rel;
        -- G_eff is used as subject for possible access rules that authorize the request.
        if (o', a') ∈ (o, a) of {set of rules with g ∈ G_eff as subject}
            then return True
            else return False
        end if
    end
    else return False
    end if
end

Figure 7: Determination of effective subject.

This  algorithm  works  as  follows: 

•  First the effective subject is determined by considering all the direct, inherited, and acquired
rights for subject s' (Figure 7).

•  We now have a set of groups that are the possible subjects for access rules authorizing this
request; let these be G_eff = {g1, g2, g3}.

•  We can think of the relationship Rights as a table linking subjects (groups) to data. We then
search this table and determine the rules that have g1, g2, or g3 as subjects. If any of those
rules has o = o' as security object, we check in the relationship table the corresponding access
type a. If this matches a', the request is authorized.
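
The evaluation steps above can be sketched as follows. This is a minimal illustration, assuming the OR-ing policy for group membership and assuming that ancestors, descendants, and related groups have already been computed and are available as plain dictionaries; all names and the sample data are hypothetical.

```python
def check_rights(s, o, a, membership, rules, ancestors, descendants, related):
    """Return True if request (s, o, a) matches a rule whose subject is in G_eff.

    membership:  {user: set of groups}          (Group_membership, OR-ing policy)
    rules:       set of (group, object, access) triples
    ancestors, descendants, related: {group: set of groups}
    """
    groups = membership.get(s, set())
    if not groups:
        return False  # a user without groups has no rights in this model

    # Effective subject: the user's groups plus every group reached through them.
    g_eff = set(groups)
    for g in groups:
        g_eff |= ancestors.get(g, set())
        g_eff |= descendants.get(g, set())
        g_eff |= related.get(g, set())

    # Exact match on object and access type, i.e. no implied accesses on the data side.
    return any((g, o, a) in rules for g in g_eff)


membership = {"E. Lu": {"Real_time_Programmer"}}
ancestors = {"Real_time_Programmer": {"Programmer"}}
rules = {
    ("Programmer", "compiler", "execute"),
    ("Real_time_Programmer", "rt_simulator", "execute"),
}
allowed = check_rights("E. Lu", "compiler", "execute",
                       membership, rules, ancestors, {}, {})
denied = check_rights("E. Lu", "payroll", "read",
                      membership, rules, ancestors, {}, {})
```

Restricting a user to one active group would only change how the set of groups is obtained; the rest of the function stays the same.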

One should note here that this approach applies to any type of class structuring for the protected
data. In the model of [7] the set of rules that have G_eff as subject are determined by implication
along the data hierarchy (implied accesses are inherited from superclasses or subclasses) while in
other models they would be determined in other ways. For example, in systems without implied
accesses one needs to find a rule that matches exactly the requested data unit, i.e., o = o'. Another
important issue is the meaning of G; it could be interpreted as the set of all the groups to which u'
belongs (as shown in the algorithm) or as one of these selected by the user [13] (see discussion in
Section 3). The algorithm is also general in this sense; in a specific implementation one could adopt
either policy. For example, in certain applications the structure of a program under development
can be modeled very conveniently by a class hierarchy describing its components. Versions of a
program can be described by relationship associations between programs. This approach then
provides a unified view of the complete system, as well as an easily implementable design.

Figure 8: A grouping of programmers.

Two  aspects  must  be  considered  in  the  implementation  of  this  algorithm: 

•  Complexity. In general the groups of groups for a real application will be simple, with at
most three to four levels. This implies that the propagation of access rights is rather short.
To improve efficiency even more, the effective subject can be determined at login time or even
at compile time.

•  Graph structure. If groups are defined without control, loops may occur. Because of the
recursive nature of the algorithm these must be avoided. The Add_group method should
check for possible loops.
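
One way to perform the loop check mentioned above is to reject an edge whenever the proposed parent is already reachable from the proposed child. The sketch below is illustrative only; the class GroupGraph and its method names are assumptions that loosely follow GROUP_OF_GROUPS, not the procedures specified in [10].

```python
class GroupGraph:
    """Hypothetical group-of-groups graph holding a single type of association."""

    def __init__(self, name, kind):
        self.name = name
        self.kind = kind      # "generalization", "aggregation" or "relationship"
        self.children = {}    # group -> set of its direct children

    def descendants(self, g):
        # Iterative depth-first search, so the traversal itself cannot recurse forever.
        seen = set()
        stack = list(self.children.get(g, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(self.children.get(node, ()))
        return seen

    def ancestors(self, g):
        return {p for p in self.children if g in self.descendants(p)}

    def add_group(self, child, parent):
        # The new edge parent -> child closes a loop exactly when parent is the
        # child itself or is already reachable from the child.
        if child == parent or parent in self.descendants(child):
            raise ValueError(f"adding {child!r} under {parent!r} would create a loop")
        self.children.setdefault(parent, set()).add(child)
        self.children.setdefault(child, set())


graph = GroupGraph("Programming", "generalization")
graph.add_group("Systems_Programmer", "Programmer")
graph.add_group("Kernel_Programmer", "Systems_Programmer")
```

With three to four levels, as expected for real applications, the reachability search at insertion time is cheap.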


7  Application  of  the  proposed  approach 


We  will  show  the  practical  value  of  our  proposal  through  a  detailed  application  example.  We 
consider  a  software  development  environment  because  it  requires  a  rich  variety  of  authorization 
policies  [4]. 

We  assume  a  typical  software  development  company  and  we  see  how  logical  actions  of  its 
operation  can  be  implemented  by  specific  commands  from  the  set  of  group  primitives  described 
above. 


1.  Assume the initial state of the development system is described by the graph of Figure
8. Note that Programming is the name of the group of groups; that is, up to this point the
operation of the system is such that only one grouping with two levels is necessary, i.e., we only
have programmers and they are classified in several types requiring different types of tools.
Programmers have access to standard compilers and editors, while real-time programmers,
for example, can also access real-time simulators, a Petri-Net specification tool, and other
specialized tools.

2.  Mary Jones (an application programmer) and Eduardo Lu (a systems and real-time programmer)
join the company and are assigned to their corresponding groups through the commands:

Join_group (M. Jones, Application_Programmer)

Join_group (E. Lu, Systems_Programmer)

Join_group (E. Lu, Real_time_Programmer)

Now, for example, M. Jones can use the specialized tools needed by application programmers.
She inherits rights to access the standard tools needed by all programmers as well.

3.  Now the company decides to go into the high-security systems market and hires several
programmers who are specialists in secure systems. In order to incorporate these specialists
in the system we perform the following actions:

a)  Create a new group of security programmers:

Create_group (Security_Programmer)

b)  Add to this group the k programmers just hired:

Join_group (name_1, Security_Programmer)

...

Join_group (name_k, Security_Programmer)

c)  Make this group a generalization subgroup of Programmer:

Add_group (Programming, Security_Programmer, Programmer)

Now the company structure looks as shown in Figure 9.

Figure 9: The company after step 3.

4.  The market for applications programming is poor and the company decides to abandon this
field. The members of this group will be reassigned or laid off.

a)  Group Application_Programmer is deleted:

Delete_group (Programming, Application_Programmer)

b)  Some Application Programmers, e.g., R. Johnson, are laid off:

Leave_group (R. Johnson, Application_Programmer)

R. Johnson is also deleted from class User.

c)  Others are just reassigned:

Join_group (M. Jones, Systems_Programmer)

The company now looks as shown in Figure 10.

Figure 10: The company after step 4.

5.  Management realizes that it is impossible to develop good systems without hardware specialists
and hires several of these. They are assigned to three specialization groups: processor
designers, I/O system designers, and CRT designers. The new state of the company is shown
in Figure 11.

6.  This organization works reasonably well because the users have the rights they need to use
their tools and the systems they are developing. It assumes that a project is divided into the
different specialties according to their abilities. If there is only one project under development
that is all we need. However, we want more flexibility. What if we need to select some
software and hardware designers to work in a high-priority project, e.g., a new real-time
computer? This is the reason for the relationship groups that we have introduced. Note that
an aggregation group will not do: an aggregation group including, for example, the groups
Real_time_Programmer and Processor_Designer will select all of these people, not just a specific
set of them. We then define a relationship group: Relate_group (New_real_t_comp,
Real_time_Programmer, Processor_Designer). We then assign rights to this group according to its
intended function. Then we can indicate its members: Join_group (E. Lu, New_real_t_comp),
etc. The new state of the company at this stage is shown in Figure 12.


Figure 11: The company after step 5.

Figure 12: The company after step 6.


8 Conclusions


The  main  contributions  of  this  work  are: 

•  A basis for unstructured groups of users including a set of procedures to manipulate them (a
more detailed set of procedures can be found in [10]).

•  A  framework  for  structured  groups  of  groups  including  the  corresponding  procedures  for  their 
handling.  A  detailed  comparison  with  other  approaches  can  be  found  in  [10]. 

•  A  new  concept  to  structure  groups  of  groups,  the  relationship  group,  which  had  not  been 
proposed  in  any  previous  system  and  which  is  useful  to  define  some  types  of  security  policies, 
i.e.,  it  enhances  the  precision  of  the  authorization  system. 

•  A  new  set  of  authorization  policies  for  groups  of  groups  which  can  reflect  the  least  privilege 
policy  into  the  institution  group  structure. 

•  The formulation of the authorization system itself in terms of classes and associations between
these classes. This provides unity to the complete system design in that all of these models
can be shown to be special cases of ours. Our approach also provides a direct basis for
implementation. More importantly, it allows the authorization system to protect itself. This
had not been done for object-oriented databases; in fact other models, e.g., [13], use
low-level primitives for this purpose. This also results in simpler algorithms in the authorization
system.

•  The  application  of  the  proposed  group  structures  to  the  evaluation  of  access  requests.  This 
can  significantly  improve  the  efficiency  of  the  authorization  system.  A  specification  of  the 
necessary  algorithm  is  presented.  This  algorithm  can  be  combined  with  evaluation  algorithms 
based  on  data  structuring  [7]. 

In fact, although all these concepts were developed with an object-oriented database in mind,
they can be applied to other types of databases. Further, they can also be applied to operating
systems. They are particularly useful for distributed environments where there are many users with
similar roles.


Abstract

Over the past three years, our work has explored the attainment of user-role based
security (URBS) for discretionary access control (DAC) within an object-oriented design
model. Our approach has extended the public interface (which defines the means for
accessing classes or object types) to allow its methods to be selectively assignable (or
prohibited) on a role-by-role basis. This allows different users at different times to have
particular access to the public interface based on their specific roles.

The work presented in this paper details our prototyping efforts as we transition
from research on URBS and DAC to the object-oriented design and analyses environment
ADAM. ADAM, short for Active Design and Analyses Modeling, is a language-independent
environment that automatically generates compilable code in C++, Ontos
C++, Ada83, or Ada9X (object-oriented extension to Ada) from designs that have been
supplied via text and form-based input. ADAM unifies the structural and security
requirements, elevating the latter to a first-class citizen in the design process. ADAM also
offers a framework of analysis techniques that are intended to support a more precise
and accurate characterization of an application and its security requirements. Note that
we also present recent research results on the support for integrity constraints within
our model and ADAM, and discuss their potential impact on security considerations.


1  Introduction 

Over the past three years, our research emphasis has concentrated on the investigation
and attainment of discretionary access control (DAC) within an environment that supports
object-oriented design and offers user-role based security (URBS) [2,10,19] as an
equal partner in the development process. URBS was proposed in 1988 [13,18] as a technique
which focuses on individuals by characterizing and defining their responsibilities
within the application as the means for identifying and establishing security privileges.
We have chosen the object-oriented paradigm, since it has drawn much interest in recent
years, in academia, industry, and government, and appears to offer unique capabilities
that promote the design process (e.g., encapsulation and hiding - public interface vs.
private implementation) and facilitate software evolution (e.g., representation independence
and inheritance). Existing support for DAC in the object-oriented approach is
limited to the public interface (i.e., the set of all visible methods, their parameters, and
their return types for each object type or class), where all users, regardless of their needs
within the application, have full access to all methods in the public interface.

Our  research  efforts  have  sought  to  customize  access  to  the  public  interface,  to  allow 
different  individuals  to  have  particular  access  to  specific  subsets  of  the  public  interface  at 
different  times  via  their  roles  within  the  application.  Our  approach  [2,10,19]  has  focused 
on  establishing  privileges  by  assigning  (positive)  and  prohibiting  (negative)  methods 
based  on  roles.  A  role  assigned  a  method  can  invoke  the  method,  and  by  inference  can 
also  access  instances  on  the  object  type  on  which  the  method  is  defined,  and  potentially 
read  and  modify  private  data  that  the  method  might  use.  A  prohibited  method  restricts 
access  in  a  similar  fashion.  The  idea  of  defining  both  assigned  and  prohibited  methods 
is  important,  since  it  allows  the  possible  problem  of  information  leakage  [16]  due  to  the 
inheritance  between  object  types  to  be  addressed. 

There  have  been  a  number  of  other  efforts  related  to  our  work.  The  approach 
in  [14]  is  similar  to  our  process  of  method  assignment,  but  differs  since  they  assign 
objects  and  authorization  types  to  roles.  Our  approach  also  contrasts  to  [13],  where  the 
access  rights/permitted  roles  are  assigned  based  on  data  levels.  Our  approach  and  [15] 
both  have  the  goal  of  providing  different  interfaces  to  different  users,  but  differ  since 
they  assign  views  based  on  data.  When  considering  negative  privileges,  our  concept  of 
prohibited  methods  is  similar  to  the  concepts  of  negative  authorization  in  [14],  denied 
roles  in  [13],  and  permission  tags  in  [1],  but  differs  since  all  of  their  efforts  emphasize 
data;  we  focus  on  object  types/methods.  Other  efforts  for  security  in  object-oriented 
systems  via  mandatory  access  control  [12,17]  differ  from  our  DAC  approach. 

Our  purpose  in  this  paper  is  to  report  on  the  status  of  our  prototyping  efforts  for 
supporting  object-oriented  design  that  includes  URBS  definition.  Ongoing  work  has 
resulted  in  the  development  of  an  object-oriented  environment,  ADAM  (short  for  Active 
Design  and  Analyses  Modeling),  that  is  capable  of  supporting  both  the  design  process 
and generating compilable code in multiple languages [6,7,8]. ADAM currently supports
code generation for two dialects of C++ (GNU C++ and Ontos C++, an object-oriented
database system), Ada 1983 [4], and Ada9X [5]. This paper reports on the
incorporation  of  a  portion  of  our  previous  security  research  into  the  ADAM  environment, 
focusing  on  the  definition  of  security  privileges  on  user  roles  [10]. 

Our approach also contains a framework of analysis techniques [2,10,19] that operate
from two perspectives to indicate: which user roles have access to a specific aspect (OT,
method, private data) of an application; and what access a chosen user role has to the
application. Analyses are supported in two different ways within ADAM. First,
as privileges are being defined for different user roles, they are automatically checked
against all existing privileges for that role to identify conflicts or inconsistencies in real
time. Second, at any time during the overall process, a security designer can initiate a
number of analyses that allow him or her to understand and evaluate the
defined privileges against the desired security requirements of the application. These
analyses can be used by the designer to investigate realized privileges against the
application's intended security requirements. This paper will discuss the ADAM prototype,
with an emphasis on its support for security definition and analyses.

The remainder of this paper is organized into four sections. In Section 2, we describe
the object-oriented model and URBS definition capabilities of ADAM, using a
health care application (HCA). Section 3 reviews the available analysis framework that
allows the designer to understand and evaluate the application from complementary
perspectives, with the goal of supporting a more precise characterization of an application.
In  Section  4,  we  detail  ongoing  research  and  prototyping  issues,  including  an  extension 
to  the  object-oriented  model  and  ADAM  for  supporting  integrity  constraints,  and  a 
discussion  of  next-step  research  concepts  for  security  enforcement.  Finally,  Section  5 
summarizes  the  paper  and  indicates  future  plans. 


2  Object-Oriented  Design,  URBS,  and  ADAM 

The object-oriented design model for ADAM is tightly integrated into the environment,
with the semantics, scope, content, and context of each modeling construct clearly defined.
There is no specific syntax for the design model; choices are made via menus,
browsers, etc., and text is directly entered by the designer using forms. Thus, the environment
stresses language independence by focusing on design and allowing code to be
generated in a variety of target languages to support the transition to implementation.
ADAM supports incremental design by allowing design data to be stored
persistently in the Ontos database system. Design analyses are promoted through profiles
[8,10], which are detailed requirements on the semantic content and context for all
constructs of the application. Profiles have two purposes:

1. Force software engineers to supply detailed information as an application is designed.

2. Provide on-demand and automatic analyses for feedback to software engineers
whenever an action in the environment results in a conflict or possible inconsistency.

To support the entire design, development, and analyses processes, the ADAM environment
has been partitioned into two stages: design phases and semantic perspectives.

Design phases (DPs) are intended to be used for constructing the application structure
and behavior by segmenting the design process into logical parts. That is, through
DPs, software engineers can define, modify, and evolve applications, including both their
information and security characteristics. Throughout the design process, profiles are
entered by software engineers and, in some cases, this data is propagated throughout
the entire environment. As software engineers develop and detail their designs, DPs also
offer automatic feedback as discussed above. At any point during the design process,
code can be generated so that designers may inspect whether generated code meets their
needs and requirements.

Semantic  perspectives  (SPs)  are  the  stage  in  ADAM  that  can  be  used  by  software 
engineers  to  analyze  structure  and  behavior  by  examining  different  context  views  of  an 
application’s  content.  By  understanding  the  semantic  associations  through  feedback 
based  on  information,  methods,  object  types,  and  user  roles,  software  engineers  can 
refine and redefine their designs. The analyses supported via SPs are accomplished
using profiles. However, the SPs are a “read-only” world; to correct problems, changes
must be made in the DPs.

The current implementation of ADAM (as of July 1, 1994) supports five design
phases (three for modeling constructs and two for URBS and authorization) and two
semantic perspectives (for object types and URBS). The remainder of this section discusses
four of the five design phases and in the process reviews the major concepts using
examples from a health care application (HCA) [10]. Note that ADAM has been implemented
on a Sun architecture under a Unix environment using X windows, InterViews
3.01, and AT&T C++ 2.0.

2.1  Object-Type  Specification 

In the object-type-specification phase of ADAM, the designer can define object types
(OTs), attributes, and methods through associated profiles. This work has been explored
elsewhere [8], and is only briefly reviewed here. An attribute profile (AP) includes:

1.  a  prose  attribute  description  on  the  purpose  of  the  attribute  in  the  OT 

2.  the  name  and  type  of  the  attribute 

3.  a  list  of  the  methods  that  access  the  attribute  and  an  indication  of  whether 
the  attribute  is  read  and/or  written  by  each  method 

A  method  profile  (MP)  includes  the  following  information: 

1.  a  prose  method  description  for  the  method’s  actions  within  the  application 

2.  the  method’s  name,  return  type,  and  parameter  list  (with  names/types) 

3. the read/write set for each private attribute used by the method M_i

4. the other methods that are called by M_i to accomplish its task

An OT profile (OTP) consists of the following information:

1. a prose OT description for the purpose of the OT in the application

2.  the  OT’s  name 

3.  the  persistency  status  of  the  OT 

4.  the  attribute  profiles  for  all  private  attributes 

5.  the  method  profiles  for  all  methods 
6. the relationships that involve the OT

7. the supertypes and subtypes of the OT

Figure 1: Object Types of a Health Care Application (HCA).

A subset of the OTs of an HCA, based on prior work [10], is shown in Figure 1. In
this  figure,  there  are  ten  OTs:  Object,  Item,  Record,  Visit,  Prescription,  Test, 
Medical_R,  Prescription_R,  Financial_R,  and  Patient.  Visit,  Prescription,  and 
Test  are  subtypes  of  Item,  for  different  medical  procedures  for  a  patient.  The  profiles 
for  OT  Medical_R,  attribute  Medical_History,  and  method  Read_Med_Rec  are  shown 
in  Figure  2.  Note  that  some  information  in  profiles  is  designer-supplied  (e.g.,  each  MP) 
while  for  others,  input/choices  by  the  designer  automatically  update  relevant  profiles 
(e.g.,  each  MP/AP  is  included  in  OTP). 

2.2  Relationship-Type  Specification 

In the second phase of ADAM, the designer can define the relationship types (RTs)
between different OTs via a relationship-type profile (RTP). An RTP is a specialized
OTP that contains:

1.  all  information  from  an  OT  profile 

2.  the  relationship  variant  (e.g.,  one-to-one,  one-to-many,  set,  etc.) 

3.  the  involved  source  and  destination  OTs 
Figure 2: Profiles for Medical_R, Medical_History, and Read_Med_Rec.


Note  that  except  for  relationship  name  and  description,  all  other  information  in  the 
profile is automatically generated based on a designer’s actions. A subset of
the  relationships  between  the  OTs  given  in  Figure  1  is  shown  in  Figure  3.  In  this 
figure,  we  have  established  associations  between  the  different  OTs.  For  example,  a 
Medical_R(ecord)  contains  the  Visits,  Prescriptions,  and  Tests  that  catalog  the 
complete  medical  history  of  the  Patient. 


2.3  URDH  Specification 

To support URBS, the user-role definition hierarchy (URDH) characterizes the different
kinds of individuals (and groups) who require different levels of access to an
application. The responsibilities of individuals are divided into three distinct levels of
abstraction for the URDH: user roles, user types, and user classes. User roles allow the
security software engineer to assign particular privileges to individual roles. To represent
common responsibilities among user roles, a user type can be defined. Privileges
that are assigned to a user type are systematically passed to all of its roles. The different
user types of an application can be grouped into one or more user classes. Privileges
that are supplied to each class are passed on to its types and their roles.
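The downward passing of privileges can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ADAM's implementation: the class UrdhNode, the single-parent simplification (the text allows a user type to belong to more than one user class), and the method names Get_Patient_Name and Write_Vital_Signs are introduced here for the example; only the node names and Read_Med_Rec come from the HCA.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class UrdhNode:
    """A user class, user type, or user role in the URDH (illustrative only)."""
    name: str
    parent: Optional["UrdhNode"] = None   # assumption: one parent per node
    assigned: Set[str] = field(default_factory=set)

    def effective_assigned(self) -> Set[str]:
        """Union of this node's assigned methods and those of all ancestors."""
        methods = set(self.assigned)
        node = self.parent
        while node is not None:
            methods |= node.assigned
            node = node.parent
        return methods


# Hypothetical privilege assignments for the HCA node names.
medical_staff = UrdhNode("Medical_Staff", assigned={"Get_Patient_Name"})
nurse = UrdhNode("Nurse", parent=medical_staff, assigned={"Read_Med_Rec"})
staff_rn = UrdhNode("Staff_RN", parent=nurse, assigned={"Write_Vital_Signs"})

print(sorted(staff_rn.effective_assigned()))
```

In this sketch a privilege is stored once, at the most general node to which it applies, and is computed for a role on demand by walking toward the root.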

Figure  4  shows  a  partial  URDH  created  in  the  ADAM  environment  for  the  HCA, 
with  a  more  complete  URDH  given  elsewhere  [10].  In  the  figure,  the  roles  are  defined  in 
a  two-step  process  of  specialization  (top-down)  and  generalization  (bottom-up).  From 
a  top-down  perspective  in  Figure  4,  there  are  two  different  user  types:  Nurse  and 
Physician.  In  this  case,  the  software  engineer  is  assuming  that  each  of  these  user  types 
may  have  privileges  that  would  be  common  to  all  user  roles  under  the  type.  Within 
each user type, one or more user roles may be defined. For example, in Figure 4,
user roles for Nurse include Staff_RN, Discharge_Plng (planning), Education, and
Manager. The URDH can also be examined from a bottom-up perspective to determine
the common characteristics by the grouping of the user types into user classes such as
Medical_Staff, Support_Staff, and Other, which are not shown in this figure.

Figure 3: A Subset of the Possible Relationships.

To  more  accurately  characterize  the  capabilities  of  user  classes,  user  types,  and  user 
roles  in  the  URDH,  with  respect  to  the  privileges  to  be  granted  against  the  application, 
we  propose  the  creation  of  a  node  profile  (NP).  A  node  profile  contains: 

1.  a  name  for  the  node  (user  role,  user  type,  or  user  class) 

2.  a  prose  description  of  its  responsibility 

3.  a  set  of  assigned  methods  (the  positive  privileges) 

4.  a  set  of  prohibited  methods  (the  negative  privileges) 

5.  a  set  of  criteria  for  relating  URDH  nodes 

A user-class profile or user-type profile is a specialized node profile. A user-role profile
is a specialized node profile that also contains a prose description of its security
requirements.

In the URDH-specification phase of ADAM, the designer must select the node type
(user role, user type, user class) to define a new URDH node as shown in Figure 5.
After selecting the type of a node, the designer must supply the node name and node
description for the created node. The initial information for the node profile of the user
role Staff_RN is shown in Figure 6. After a node is created, the designer can select
the menu option AddProfile to supply other information on the security privileges for
a node, i.e., assigned and prohibited methods, and consistency criteria. The designer
uses the mouse to select the assigned/prohibited methods from a list of previously
defined methods. To specify the equivalence/subsumption criteria, the designer uses
the mouse to select nodes from the list of defined URDH nodes. A selection of assigned
methods for user role Staff_RN is shown in Figure 7.

Figure 4: The URDH of the HCA.

Figure 5: Selection of URDH Nodes.

Figure 6: Initial Information for the User Role Staff_RN.

Figure 7: Selection of Assigned Methods for the User Role Staff_RN.

In the URDH-specification phase of ADAM, the checking on assigned/prohibited
methods is performed automatically to ensure that there are no conflicts, e.g., an assigned
method conflicts with an earlier prohibited method. If a problem is identified by
the analyses, the system will not accept the specification and will require a correction
by the designer, as shown in Figure 8. Note that the conflict may be more subtle,
and arise due to nested method calls, e.g., one assigned method calls a method that
calls a method that is prohibited. The checking on consistency criteria is also performed
automatically to ensure that there are no conflicts when designers specify the
assigned/prohibited methods and/or the consistency criteria, as shown in Figure 9.
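One way to detect such nested conflicts is to follow the call lists recorded in the method profiles and test every reachable method against the prohibited set. The sketch below assumes the call lists are available as a dictionary; the function names and the three example methods are hypothetical, and the sketch is not claimed to be the algorithm that ADAM uses.

```python
from typing import Dict, List, Set


def reachable_methods(start: str, calls: Dict[str, Set[str]]) -> Set[str]:
    """Return start plus every method reachable through nested calls."""
    seen: Set[str] = set()
    stack: List[str] = [start]
    while stack:
        method = stack.pop()
        if method in seen:
            continue
        seen.add(method)
        stack.extend(calls.get(method, set()))
    return seen


def find_conflicts(assigned: Set[str], prohibited: Set[str],
                   calls: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each assigned method to the prohibited methods it can reach."""
    conflicts: Dict[str, Set[str]] = {}
    for method in assigned:
        hits = reachable_methods(method, calls) & prohibited
        if hits:
            conflicts[method] = hits
    return conflicts


# Hypothetical call graph: Print_Summary calls Format_Record,
# which in turn calls the prohibited Read_Orders.
calls = {"Print_Summary": {"Format_Record"}, "Format_Record": {"Read_Orders"}}
conflicts = find_conflicts({"Print_Summary"}, {"Read_Orders"}, calls)
print(conflicts)
```

A direct check would compare only the assigned and prohibited sets themselves; the traversal is what exposes the two-level case described in the text.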

The complete node profiles for the user roles Staff_RN and Manager are shown in
Figure 10. The node description and node security requirements for Staff_RN were:

Node Description: Administer direct care to patients and implement the physician
treatment plan.

Security Requirement: All clinical information for the patients that they are responsible
for (referred to subsequently as clinical info.). Can write/modify a substantial
portion of clinical information to record the results/patient progress. Cannot
change a Physician’s orders on a patient.

Other node profiles for the user type Nurse and user class Medical_Staff are given in
Figure 11.

Figure 8: Conflict Identification Message.

Figure 9: Consistency Criteria Checking Message.

Figure 10: Node Profiles of the User Roles Staff_RN and Manager.


2.4  Authorization-List  Specification 

To  more  accurately  characterize  the  capabilities  of  users  in  an  application,  with  respect 
to  the  privileges  to  be  granted,  we  employ  user  profiles,  which  are  similar  in  concept  to 
node profiles (see Section 2.3). A user profile (UP) contains:

1.  a  name  for  the  user 

2.  a  prose  description  of  its  responsibility 

3. a prose description of its security requirements

4.  a  set  of  assigned  roles  (the  positive  privileges) 

5.  a  set  of  prohibited  methods  (the  negative  privileges) 

6.  a  set  of  criteria  for  relating  users 

In the authorization-list-specification phase of ADAM, the designer must supply the
user name, user description, and user security requirements when creating a new user.
Since all of the actions performed in this phase are very similar to the URDH phase, we
omit bit maps. However, note that conceptually, a user has privileges via a set of one
or more assigned roles. So, the URDH information is aggregated for each role for which
a user has been authorized (or prohibited).

Figure 11: Node Profiles of the User Type Nurse and the User Class Medical_Staff.


3  Security  Analyses 

To provide the designer with the ability to compare/contrast the privileges which have
been defined (via URDH and/or authorization list) with the application’s intended
security requirements, an analyses framework is supported. In this section, we briefly
review the analyses that have been implemented in ADAM. The research motivation and
algorithmic techniques for these analyses were reported in an earlier work [11]. Recall
that  the  analyses  are  supported  in  the  semantic  perspectives  of  ADAM,  as  described 
in  Section  2.  We  provide  bit  maps  of  ADAM  to  demonstrate  a  select  subset  of  the 
supported  analyses. 


3.1 Analyses of User-Role Definition Hierarchy

ADAM  supports  designer-initiated  analyses  of  the  URDH  in  two  categories: 

• the capabilities of the URDH node based on the assigned and prohibited methods

• the authorization analysis of the application based on the assigned and prohibited
methods

In the first category, the security designer chooses a URDH node (say Staff_RN) and
can  then  analyze  its  privileges  with  respect  to  OTs,  methods,  and  private  data  that 
can/cannot  be  accessed.  In  the  second  category,  an  aspect  (OT,  method,  attribute)  of 
the  application  is  selected  and  the  different  user  roles  that  have  access  are  supplied. 

All of these and later analyses occur at direct and indirect levels. Direct analyses
inspect only the explicit privileges that have been established. Indirect analyses search
for privileges exhaustively, since methods can call other methods (see the method profile
definition in Section 2.1). These nested method calls are important, since they provide
an inferred access to methods, OTs, and private data that may not have been intended
by the security designer. Indirect analyses are also used to support automatic analyses
as discussed in Section 2.3. Due to space limitations, we will omit indirect analyses from
our remaining discussion.

3.1.1  Capabilities  Analyses 

Capabilities  analyses  allow  the  security  designer  to  review  the  permissions  given  to 
a  chosen  URDH  node  on  an  application’s  OTs,  methods,  and/or  private  data.  This 
review  can  occur  throughout  the  time  period  when  the  designer  is  defining  the  URDH 
and  establishing  assigned/prohibited  methods  for  its  nodes.  For  example,  the  designer 
can choose the Staff_RN node and be presented with:

• all methods which have been assigned to Staff_RN and its ancestors (Nurse,
Medical_Staff, and Users);

• all OTs which can be accessed by Staff_RN, since each assigned method belongs
uniquely to a single OT; and

• all private data which is accessed by Staff_RN, since each assigned method uses
private data in a read, write, or read/write fashion.

Analysis is also available for the prohibited methods to find what cannot be accessed.
For example, the designer can choose the Staff_RN node and be presented with:

• all methods which have been prohibited to Staff_RN and its ancestors (Nurse,
Medical_Staff, and Users);

• all OTs which cannot be accessed by Staff_RN, since each prohibited method
belongs uniquely to a single OT; and

• all private data which is not accessed by Staff_RN, since each prohibited method
uses private data in a read, write, or read/write fashion.
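The derivation from methods to OTs and private data in the lists above can be sketched as follows. The sketch assumes that each method profile records its owning OT and its read/write sets, as described in Section 2.1; the MethodProfile class, the capabilities function, and the method Write_Med_Rec are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set, Tuple


@dataclass(frozen=True)
class MethodProfile:
    """Reduced method profile: owning OT plus read and write sets."""
    name: str
    owner_ot: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def capabilities(methods: Iterable[str], profiles: Dict[str, MethodProfile]):
    """Derive the reachable OTs and per-attribute access modes from methods."""
    ots: Set[str] = set()
    data: Dict[Tuple[str, str], Set[str]] = {}
    for name in methods:
        mp = profiles[name]
        ots.add(mp.owner_ot)          # each method belongs to exactly one OT
        for attr in mp.reads:
            data.setdefault((mp.owner_ot, attr), set()).add("read")
        for attr in mp.writes:
            data.setdefault((mp.owner_ot, attr), set()).add("write")
    return ots, data


profiles = {
    "Read_Med_Rec": MethodProfile(
        "Read_Med_Rec", "Medical_R", reads=frozenset({"Medical_History"})),
    # Hypothetical method, not taken from the HCA design.
    "Write_Med_Rec": MethodProfile(
        "Write_Med_Rec", "Medical_R", writes=frozenset({"Medical_History"})),
}
ots, data = capabilities({"Read_Med_Rec", "Write_Med_Rec"}, profiles)
print(ots, data)
```

Passing the prohibited methods instead of the assigned ones would yield the corresponding view of what cannot be accessed.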

In the semantic perspective of the URDH, a list of available analyses can be enabled
as shown in Figure 12. The designer can select any option from the list and perform the
desired analyses. If the “Direct Assigned Methods” option is selected, a set of assigned
methods on a selected node will be returned. The results of the direct analysis of
assigned methods for Staff_RN are shown in Figure 13. Each method name is associated
with the URDH node name, so that the designer can understand where the methods
have been assigned. If the method list cannot fit into one window, the designer can
use the scroll bar to check other methods. If the designer identifies any problem(s) in
method assignments (or the privileges) as a result of the analyses, correction(s) can be
made by modifying the URDH, node profiles, and/or the application. Direct analyses for
user types work in a similar fashion as shown for Nurse in Figure 14. Correspondingly,
when the “Direct Prohibited Methods” option is selected, a set of prohibited methods
on a selected node will be returned as indicated for Staff_RN in Figure 15.

Figure 12: Available Capabilities Analyses of the URDH.

3.1.2  Authorization  Analyses 

Authorization  analyses  allow  the  designer  to  investigate  which  user  roles  have  what 
kinds  of  access  to  different  aspects  of  an  application  (i.e.,  an  OT,  a  method,  or  a  private 
data item). For example, the designer can choose the Medical_R OT and be presented
with:

• all user roles which have access to the Medical_R OT;

• all user roles which have access to the methods of the Medical_R OT; and

• all user roles which have access to the private data items of the Medical_R OT.

The authorization analyses of prohibited methods are similar to the analyses of assigned
methods. For example, the designer can choose the Medical_R OT and be presented
with a list of user roles that should not be allowed access to the OT. Clearly, these
analyses are the complement of the capabilities case, and are being implemented.

Figure 13: Direct Analysis on Assigned Methods for the User Role Staff_RN.

Figure 14: Direct Analysis on Assigned Methods for the User Type Nurse.

Figure 15: Direct Analysis on Prohibited Methods for the User Role Staff_RN.
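Since an authorization analysis inverts the capabilities view, a minimal sketch needs only a lookup from each method to its owning OT. The function roles_with_access, the method Read_Fin_Rec, and the privilege assignments below are assumptions made for illustration; the method sets are taken to be the effective (inherited plus direct) assigned methods of each role.

```python
from typing import Dict, Set


def roles_with_access(ot: str, role_methods: Dict[str, Set[str]],
                      method_owner: Dict[str, str]) -> Set[str]:
    """Return the roles holding at least one assigned method defined on ot."""
    return {
        role
        for role, methods in role_methods.items()
        if any(method_owner.get(m) == ot for m in methods)
    }


# Hypothetical assignments; Read_Fin_Rec is an invented method name.
method_owner = {"Read_Med_Rec": "Medical_R", "Read_Fin_Rec": "Financial_R"}
role_methods = {"Staff_RN": {"Read_Med_Rec"}, "Manager": {"Read_Fin_Rec"}}

print(roles_with_access("Medical_R", role_methods, method_owner))
```

Supplying the prohibited method sets in place of the assigned ones gives the list of roles that should not be allowed access to the chosen OT.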

3.2  Analyses  of  Authorization  List 

ADAM also supports a semantic perspective for analyses based on the authorization
list, where one or more user roles have been assigned (and/or prohibited)
to each individual who accesses an application. These analyses extend the URDH
situation, since they (in most cases) repeatedly call the “relevant” URDH analyses for
each user role assigned to an individual. The designer-initiated analyses provided for
the authorization list are:

•  the  capabilities  of  the  individual  based  on  the  assigned  and  prohibited  roles 

•  the  authorization  analysis  of  the  application  based  on  the  assigned  and  prohibited 
roles 

We  do  not  provide  bit  maps  from  ADAM  for  the  authorization  list  analyses,  since  these 
analyses  are  extensions  from  the  URDH  analyses  with  aggregated  information  from  the 
user  roles. 

3.3  Other  Analyses 

In addition to the security-related analyses that have been described, we have also
developed a significant set of analysis techniques for non-security object-oriented design
model constructs. For example, one analysis would choose an OT in an application and
perform a “neighborhood search” to all other OTs and RTs that are within a certain
distance. This analysis searches for both inheritance and relationship links to determine
the correct neighborhood. Another analysis that is supported examines the design of an
application to identify cycles that have been caused by different RTs linking OTs. Cycle
detection is an important step, especially since an ADAM design might span multiple
screens so that an engineer would not “see” all of the interdependencies. These and
other analyses have been detailed elsewhere [8].
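Both analyses are standard graph searches, and a minimal sketch is given here under simplifying assumptions: links connect OTs directly (RTs are not modeled as separate nodes), the neighborhood search ignores link direction, and cycle detection treats each RT as a directed link from source to destination OT. All function names are hypothetical, and the back link from Visit to Medical_R is invented to show a cycle.

```python
from collections import deque


def undirected(edges):
    """Build an adjacency map from (a, b) links, ignoring direction."""
    links = {}
    for a, b in edges:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links


def neighborhood(start, links, max_distance):
    """Breadth-first search: {OT: distance} for OTs within max_distance."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_distance:
            continue
        for nxt in links.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def has_cycle(directed):
    """Depth-first search for a cycle among directed RT links."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}

    def visit(node):
        color[node] = GRAY
        for nxt in directed.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GRAY or (state == WHITE and visit(nxt)):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in directed)


inheritance = [("Visit", "Item"), ("Prescription", "Item"), ("Test", "Item")]
contains = [("Medical_R", "Visit"), ("Medical_R", "Prescription"),
            ("Medical_R", "Test")]
links = undirected(inheritance + contains)
rt_links = {"Medical_R": ["Visit", "Prescription", "Test"]}
with_back_link = {**rt_links, "Visit": ["Medical_R"]}  # invented link

print(neighborhood("Item", links, 2))
print(has_cycle(rt_links), has_cycle(with_back_link))
```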

4  Ongoing  Research  and  Prototyping  Issues 

This section considers other relevant issues which involve research and prototyping efforts
that are in progress and have a strong relationship to security concepts. Specifically,
in the first part of this section, we review research on integrity constraints. Integrity
constraints represent a research focus [9] that has not been reported to date. Next, we
discuss a plausible approach for realizing and enforcing security in a manner that is
consistent with object-oriented precepts and principles.

4.1 Integrity Constraints

The  integrity  constraint  (IC)  construct  for  ADAM  has  been  designed  to  span  both 
programming  (typing)  and  database  (derived  values)  requirements  for  integrity  in  an 
object-oriented  model.  In  our  model,  we  take  the  view  that  ICs  are  restricted  to  within 
an  OT  and  define  the  values  that  an  attribute  may  take.  This  fulfills  the  encapsulation 
characteristic  of  the  object-oriented  paradigm  and  ensures  that  all  instances  of  the  same 
OT  have  identical  behavior.  An  IC  applies  to  a  single  instance  of  an  OT,  i.e.,  a  constraint 
may  not  involve  the  private  data  items  of  two  separate  instances  even  if  the  two  instances 
are  of  the  same  OT.  While  this  is  a  very  strict  definition  of  dependent  behavior,  note 
that  the  propagation  construct  in  ADAM  [6],  which  has  not  been  discussed  herein,  is 
available  to  handle  more  complex  situations. 

Integrity  constraints  are  inherited  within  the  ISA  hierarchy,  where  all  constraints 
defined  on  the  attributes  of  an  OT’s  ancestors  are  inherited  by  the  OT  itself.  In  addition, 
an  IC  may  apply  to  the  private  data  directly  defined  on  an  OT  and/or  to  the  private 
data  that  the  OT  has  inherited  from  its  ancestor(s).  In  this  sense,  an  OT  is  composed 
of  itself  and  its  ancestors;  while  it  appears  that  a  constraint  may  cross  multiple  OTs,  in 
reality  only  one  OT  (and  its  instance  that  contains  instances  of  ancestors)  is  involved. 
To  maintain  consistency  with  Section  2,  an  integrity  constraint  profile,  ICP,  contains: 

1.  the  name  of  the  constraint 

2. a prose IC description for the purpose of the IC

3.  the  constraint  variant 

4.  an  algebraic  expression  describing  the  constraint 

5.  the  attribute  profiles  for  all  attributes  involved  in  the  constraint 

6.  the  method  profiles  for  methods  impacted  by  the  IC 

There are two IC variants: value restriction and attribute dependent. The value-restriction
variant of ICs restricts the values that an attribute can have based on an empirical
value, e.g., Age < 25. In the attribute-dependent variant of ICs, the values that
the dependent attribute may assume are restricted based on one or more other attributes
defined on the same OT (or its ancestors), e.g., AmtDue = Charge - Insurance.

An important aspect of ICs that is related to security constraints involves our implementation
approach for constraint maintenance. The main thrust of our approach
is to ensure the integrity of all attributes on a method-by-method basis. The designer-defined
methods specified for an OT are the unit of integrity assurance: the values of
all attributes written by a designer-defined method are checked after the
execution of that method. Since ICs are limited to a single OT, and may not span
instances of OTs not related by inheritance, a more centralized approach to integrity
maintenance is not required.

We use a straightforward example from our HCA to demonstrate our approach. Suppose
that the following ICs have been defined on the Prescription OT for its Cost and
W_Sale_Cost attributes: IC1: Cost > 0 and IC2: W_Sale_Cost < Cost. Assume
that the UpdateCost method modifies both the Cost and W_Sale_Cost attributes by
the same amount. The first step in constraint maintenance generates a boolean method
for every attribute that is involved in a particular IC. This method is passed the new
value for the involved attribute and returns true if the IC still holds and false if it
has been violated. Using the previous example, a single method is produced for IC1
and two methods are produced for IC2.

Boolean CostCheckIC1(NewCst) {Return (NewCst > 0);}

Boolean CostCheckIC2(NewCst) {Return (W_Sale_Cost < NewCst);}

Boolean W_Sale_CostCheckIC2(NewW_Sale_Cost) {Return (NewW_Sale_Cost < Cost);}

The next step is to group together all of the boolean methods that involve the same
attribute so that all related ICs can be checked with a single method call, in this
case to CheckCost. Note that the results of each of the individual boolean methods are
logically ANDed together.

Boolean CheckCost(NewCst)
{ Return (CostCheckIC1(NewCst) AND CostCheckIC2(NewCst)); }

These different methods will be automatically generated by ADAM for maintaining
constraints, in conjunction with an appropriate runtime process.
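The grouping step can be sketched in Python as follows. This is an illustration rather than the code ADAM generates: each IC is represented as a predicate over a dictionary of attribute values (instead of a method that receives one new value and reads the others from the object), and the names ICS and build_attribute_checks are assumptions.

```python
from collections import defaultdict

# Each IC: (name, attributes involved, predicate over attribute values).
ICS = [
    ("IC1", ("Cost",), lambda v: v["Cost"] > 0),
    ("IC2", ("W_Sale_Cost", "Cost"), lambda v: v["W_Sale_Cost"] < v["Cost"]),
]


def build_attribute_checks(ics):
    """Group ICs by attribute; each grouped check ANDs the ICs that involve it."""
    by_attr = defaultdict(list)
    for _name, attrs, pred in ics:
        for attr in attrs:
            by_attr[attr].append(pred)
    return {
        # preds is bound as a default argument so each check keeps its own list
        attr: (lambda values, preds=tuple(preds): all(p(values) for p in preds))
        for attr, preds in by_attr.items()
    }


checks = build_attribute_checks(ICS)
print(checks["Cost"]({"Cost": 6, "W_Sale_Cost": 5}))
print(checks["Cost"]({"Cost": 4, "W_Sale_Cost": 5}))
```

Here checks["Cost"] plays the role of CheckCost (IC1 AND IC2), while checks["W_Sale_Cost"] contains only IC2, mirroring the one-method and two-method split described in the text.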

Specifically, we utilize a constraint consistency OT (CCOT), which is responsible
for maintaining information consistency with respect to an OT’s ICs at runtime. Every
designer-defined OT contains a CCOT generated by ADAM that operates conceptually
in a fashion similar to a constructor (i.e., an instance of CCOT is created whenever an
OT is instantiated to track and verify the changes to attributes against defined ICs).
The constructor for an OT is extended so that when an instance of an OT is instantiated,
a constructor for the CCOT is called to automatically create a CCOT instance. In this
manner, an object’s constraints are maintained while the vehicle which carries out the
maintenance is shielded from the user.

The CCOT is activated on each method call. Any changes to attribute values made
by the method are stored to copies of the attributes located in the constraint consistency
object. At the end of the method’s execution, the CCOT calls an attribute check
method for all attributes modified by the method. If the new values of all attributes
(the copies) are consistent with respect to their ICs, the attributes on the original object
are updated to reflect the new values, i.e., copy from CCOT to the OT. If any of these
checks fail, the effect is as if the method did not execute, since the values in the CCOT
are not copied back to the OT.

In  our  example,  if  the  UpdateCost  method  is  called,  the  constraint  checker  first 
makes  copies  of  the  W_Sale_Cost  and  Cost  attribute  values.  Next,  the  UpdateCost 
method  is  executed,  making  changes  to  the  copies  of  the  attributes  located  on  the 
CCOT.  When  the  method  has  finished  executing,  the  CheckCost  and  CheckW_Sale_Cost 
methods  are  called  and  their  results  logically  ANDed  together.  If  the  results  of  all  of 
these  methods  are  true,  then  the  values  for  Cost  and  W_Sale_Cost  are  copied  to  the 
apropos  attribute  values  located  in  the  original  object  (instance). 
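
The staging behavior described above can be sketched as follows. This is only an illustration in Python, not the code that ADAM generates: the class names `CCOT` and `Item`, the method names, the dictionary representation of attributes, and the second constraint (`Cost >= 0`) are all assumptions made for the example. Only the `W_Sale_Cost < Cost` constraint comes from the text.

```python
class CCOT:
    """Hypothetical constraint consistency object that stages attribute changes."""

    def __init__(self, owner, checks):
        self.owner = owner      # the OT instance being protected
        self.checks = checks    # attribute name -> predicate over staged values

    def run(self, method, *args):
        # Copy the current attribute values; the method works on the copies.
        staged = dict(self.owner.attrs)
        before = dict(staged)
        method(staged, *args)
        # Check only the attributes the method modified and AND the results.
        modified = [name for name in staged if staged[name] != before[name]]
        ok = all(self.checks[name](staged) for name in modified)
        if ok:
            self.owner.attrs.update(staged)   # copy from the CCOT back to the OT
        return ok                             # False: as if the method never ran


class Item:
    """Hypothetical OT with Cost and W_Sale_Cost attributes."""

    def __init__(self, cost, w_sale_cost):
        self.attrs = {"Cost": cost, "W_Sale_Cost": w_sale_cost}
        # The OT constructor also creates its CCOT instance.
        self.ccot = CCOT(self, {
            "Cost": lambda a: a["Cost"] >= 0,                       # assumed IC
            "W_Sale_Cost": lambda a: a["W_Sale_Cost"] < a["Cost"],  # IC from the text
        })

    def update_cost(self, amount):
        def body(a, amt):
            a["W_Sale_Cost"] = a["W_Sale_Cost"] + amt
            a["Cost"] = a["Cost"] + amt
        return self.ccot.run(body, amount)
```

In this sketch a failed check leaves the `attrs` dictionary of the `Item` untouched, which mirrors the statement that the effect is as if the method did not execute.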

The  use  of  a  CCOT  encapsulates  the  integrity  maintenance  behavior  within  an  OT 
while  providing  separation  between  the  behavior  and  its  maintenance  mechanism,  and 
therefore  offers  another  layer  of  information  protection.  The  approach  also  resolves  the 
undo problem; since changes are not made to actual attributes until their integrity has
been  validated,  there  are  no  actions  to  be  undone,  and  the  CCOT  instance  is  discarded. 

Some  may  argue  that  our  approach  wastes  time  by  delaying  the  validation  of  at¬ 
tribute  values  to  the  end  of  a  method  call.  However,  this  is  necessary  to  avoid  “tran¬ 
sient  inconsistency”,  where  a  value  is  inconsistent  for  a  brief  period  during  a  method’s 
execution.  For  example: 

1  UpdateCost(amount)
2  {
3      W_Sale_Cost = W_Sale_Cost + amount;
4      Cost = Cost + amount;
5  }

Suppose that the original values for the attributes are Cost = 4 and W_Sale_Cost =
3,  and  the  amount  parameter  contains  2.  If  we  were  checking  attribute  integrity  on 
a  line-by-line  basis,  when  W_Sale_Cost  is  set  to  5  at  line  3,  the  W_Sale_Cost  <  Cost 
integrity  constraint  would  no  longer  hold.  Using  our  approach,  both  W_Sale_Cost  and 
Cost  are  checked  at  the  end  of  the  method  when  both  will  contain  consistent  values 
with  respect  to  their  ICs. 
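
The transient inconsistency can be traced with a few lines of Python. This is a worked illustration of the example above, using the same starting values; the function names and the dictionary representation are assumptions introduced here.

```python
def ic_holds(attrs):
    # Integrity constraint from the text: W_Sale_Cost < Cost.
    return attrs["W_Sale_Cost"] < attrs["Cost"]


def update_cost_trace(attrs, amount):
    """Run the two assignments, recording the IC result after each line."""
    a = dict(attrs)
    trace = []
    a["W_Sale_Cost"] = a["W_Sale_Cost"] + amount   # line 3 of UpdateCost
    trace.append(("line 3", ic_holds(a)))
    a["Cost"] = a["Cost"] + amount                 # line 4 of UpdateCost
    trace.append(("line 4", ic_holds(a)))
    return a, trace


final, trace = update_cost_trace({"Cost": 4, "W_Sale_Cost": 3}, 2)
```

A line-by-line checker would reject the method after line 3 (5 < 4 is false), while a check at the end of the method sees Cost = 6 and W_Sale_Cost = 5 and accepts it.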

Note that by making changes to the CCOT and then transferring values to the
OT, we are assuming that problems requiring undoing are likely to occur. This
behavior  is  appropriate  for  applications  where  integrity  violations  are  highly  probable. 
An  alternative  approach  would  revise  the  concepts  so  that  the  CCOT  contains  the 
original  values  and  the  OT  itself  is  directly  modified.  When  an  undo  was  necessary,  the 
CCOT  values  would  be  copied  to  the  OT.  This  case  applies  well  to  applications  where 
the  ICs  are  not  likely  to  be  violated.  We  believe  that  both  approaches  are  desirable  and 
will  be  explored  as  future  research. 
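
The alternative can be sketched in the same style. The sketch below assumes a hypothetical `SnapshotCCOT` class that keeps the original values and restores them only when a constraint fails. It is not part of ADAM, and the helper functions are invented for the example.

```python
class SnapshotCCOT:
    """Hypothetical variant: holds the original values and restores them on failure."""

    def __init__(self, attrs, constraint):
        self.attrs = attrs            # the OT's live attribute dictionary
        self.constraint = constraint  # predicate over the live attributes

    def run(self, method, *args):
        original = dict(self.attrs)   # the CCOT keeps the original values
        method(self.attrs, *args)     # the OT itself is modified directly
        if self.constraint(self.attrs):
            return True               # common case: nothing to copy back
        self.attrs.clear()
        self.attrs.update(original)   # undo: copy the CCOT values to the OT
        return False


def raise_wholesale(a, amt):
    a["W_Sale_Cost"] += amt


def update_cost(a, amt):
    a["W_Sale_Cost"] += amt
    a["Cost"] += amt


attrs = {"Cost": 4, "W_Sale_Cost": 3}
guard = SnapshotCCOT(attrs, lambda a: a["W_Sale_Cost"] < a["Cost"])
```

The copy back to the OT happens only on a violation, which is why this variant suits applications where the ICs are rarely violated.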

4.2  Security  Realization  and  Enforcement 

In  a  recent  effort  [3],  we  advocated  that  the  characteristics  of  the  object-oriented 
paradigm  must  be  the  guiding  factor  in  the  design  and  development  of  security  capabil¬ 
ities.  This  includes:  basic  features  such  as  the  public  and  private  interfaces,  encapsula¬ 
tion,  and  hiding;  advanced  features  such  as  polymorphism,  dispatching,  and  overloading; 
and  paradigm  claims  such  as  software  reuse  and  evolution.  In  self-critiquing  our  own 
efforts,  we  asked  two  important  questions: 

•  How  can  and  should  these  three  advanced  features  be  utilized  to  realize  security? 

•  What  role  can  the  paradigm  claims  play  in  the  security  enforcement  process? 

Polymorphism,  through  its  type  independence  of  code,  might  be  the  vehicle  by  which 
security  code  for  object-oriented  systems  can  be  successfully  implemented  and  reused. 
In  a  URBS  solution  to  security,  different  roles  must  all  undergo  the  same  processes  of 
granting  privileges,  authentication,  and  enforcement.  When  establishing  a  security  pol¬ 
icy  for  an  application,  polymorphism  can  be  used  to  develop  class  libraries  for  supporting 
these  processes,  that  are  parameterized  by  type  (in  this  case,  user  role!).  Dispatching 
and overloading are strongly linked, and together allow an executing piece of object-oriented
code to behave differently based on the type of the invoking instance. There is a
strong  parallel  from  a  security  perspective;  dispatching  and  overloading  have  strong  ties 
to  promoting  and  supporting  the  execution  of  security  code  via  the  runtime  invocation 
of  different  methods  based  on  the  involved  user  role.  In  this  case,  the  security  policy 
and  its  associated  code  can  be  extended  and  modified  as  needed  when  user  roles  (or 
their  capabilities)  change  over  time.  The  common  theme  of  all  three  advanced  features 
is  to  consider  the  design  and  development  of  security  class  libraries,  which  are  geared 
towards  the  support  of  security  requirements  in  an  object-oriented  domain. 

The  paradigm  claims  that  appear  to  have  the  most  impact  on  security  for  object- 
oriented  systems  are  software  reuse  and  evolution.  In  practice,  these  two  claims  are 
tightly  linked  to  the  definition  and  maintenance  of  OT/class  libraries  for  object-oriented 
applications. The security solution that utilizes an OT/class library approach is strongly
tied to reuse, since once defined, these libraries can be reused as is, extended with new
capabilities, or evolved to satisfy changing needs. For a given application (like HCA),
the appropriate security libraries would be automatically included. These libraries would
provide all aspects of security, such as definition, authentication, and enforcement. For
example,  in  HCA,  a  software  tool  to  monitor  and  establish  treatment  was  to  be  devel¬ 
oped  for  all  professionals  that  administer  care,  e.g.,  nurses,  physicians,  technicians,  etc. 
In  a  URBS  approach,  each  of  these  professionals  would  have  different  user  roles.  The 
overall  security  policy  for  such  an  application  would  need  to  consider  and  distinguish 
the security requirements for each role. If such a policy for HCA were implemented as a class
library, then the user role for physician would be given more expansive access to the
library (to allow doctors to set medication and treatment) than the role for nurse. When such a
policy  is  included  in  the  software  tool,  the  end  result  is  that  the  tool  behaves  differently 
based  on  the  user  and  his/her  role  (dispatching  again). 
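
A minimal sketch of such a role-parameterized library is given below. The class names, the action names, and the `run_tool` function are hypothetical and serve only to show how a single enforcement routine, written once in a base class, can be dispatched on the user role.

```python
class Role:
    """Base user role: every role passes through the same enforcement code."""

    name = "role"

    def permitted(self):
        return frozenset()

    def invoke(self, action):
        # Enforcement is written once and reused for every role type.
        if action not in self.permitted():
            raise PermissionError(f"{self.name} may not {action}")
        return f"{self.name}: {action}"


class Nurse(Role):
    name = "nurse"

    def permitted(self):
        return frozenset({"view_treatment", "record_vitals"})


class Physician(Role):
    name = "physician"

    def permitted(self):
        return frozenset({"view_treatment", "record_vitals",
                          "set_medication", "set_treatment"})


def run_tool(role, action):
    # The same tool code behaves differently depending on the role instance.
    return role.invoke(action)
```

Adding or changing a role only requires a new subclass or a different `permitted` set; the tool and the enforcement routine are unchanged.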

To  realize  the  aforementioned  scenario  of  a  class  library  for  security,  where  the  same 
tool  would  operate  differently  depending  on  the  user  role,  there  must  be  support  at  the 
implementation  level  in  the  definition  of  OTs/classes.  The  integrity  constraint  imple¬ 
mentation  approach  (see  Section  4.1)  can  be  exploited  to  further  expand  the  capabilities 
of  the  constructor  to  include  instances  of  the  relevant  security  classes,  to  define  the  se¬ 
curity  policy.  This  is  analogous  to  the  CCOT  and  provides  a  way  to  bridge  the  gap  from 
type-level  security  to  its  instance-level  realization.  Another  choice  would  be  to  add  a 
security  constructor  to  an  OT/class,  that  would  specifically  and  uniquely  embody  the 
security  policy.  Regardless  of  the  final  choice,  the  idea  of  a  security  class  library,  and 
its  inclusion  and  reuse  both  within  and  across  applications,  can  be  strongly  advocated 
as  consistent  with  object-oriented  precepts  and  principles. 


5  Concluding  Remarks  and  Future  Work 

This  paper  has  presented  a  report  on  the  prototyping  of  ADAM,  a  unified  environment 
for supporting object-oriented design and analyses which includes URBS for DAC. While
the  core  concepts  and  constructs  for  our  object-oriented  design  model  (see  Sections  2.1 
and 2.2 again) were presented, our major emphases were on the user-role definition
hierarchy  (for  defining  roles  and  establishing  privileges  in  Section  2.3),  the  authorization 
list  (for  individuals  who  need  to  play  multiple  roles  in  Section  2.4),  and  the  available 
analyses  (see  Section  3)  within  ADAM.  These  analyses  are  critical  for  successful  design, 
since  they  allow  the  security  designer  to  compare  and  contrast  the  realized  design  against 
his/her  intended  security  requirements.  This  results  in  designs  which  are  more  accurate 
and  precise,  at  least  when  considered  from  a  URBS  perspective.  We  also  provided 
a  preliminary  report  on  our  research/prototyping  efforts  for  integrity  constraints  (see 
Section  4.1),  and  related  this  work  to  security  realization  and  enforcement  issues  (see 
Section  4.2).  Overall,  we  plan  on  continuing  to  develop  and  evolve  ADAM,  using  it  as 
a  test-bed  to  explore  and  verify  our  different  research  ideas  related  to  both  structural 
and  URBS  design. 

A  number  of  projects  related  to  ADAM  have  been  identified  and  are  ongoing: 

•  The  Impact  of  Changes  Across  the  Environment:  In  this  case,  we  are  interested  in 
what  happens  when  a  significant  change  to  the  application  occurs.  For  example, 
if  a  user  role  is  deleted,  the  impact  on  the  authorization  list  must  be  considered. 
Likewise,  if  methods  or  object  types  are  deleted,  then  the  URDH  and  the  autho¬ 
rization  list  may  be  affected.  The  issue  is  the  degree  to  which  these  changes  can 
be  automated  within  ADAM. 

•  Enforcement  of  Roles  and  Authorizations:  Throughout  the  entire  environment, 
software  engineers  involved  in  cooperative  design  or  development  on  an  application 
should  be  only  able  to  see,  use,  and/or  modify  the  methods  and  object  types  that 
have  been  authorized  to  them  based  on  their  roles.  This  must  be  enforced  in  all 
relevant  portions  of  ADAM. 

•  Automatic Documentation Generation: Originally, the different portions of profiles
(see Section 2 again) were utilized to create comments in the generated code.
This capability has been extended to automatically create LaTeX documentation
for a particular design.

Our  overall  goal  is  to  have  a  unified  environment  that  supports  all  aspects  of  software 
design,  development,  and  implementation,  with  security  as  a  critical  component. 
Abstract

Relational  database  systems  are  based  on  a  powerful  abstraction:  the  relational 
data model with the relational algebra and update semantics. If the database design
(i.e., the way the data is organized) satisfies criteria provided by this foundation,
users have assurance that they can retrieve information in a consistent,
predictable  way.  Multilevel  secure  database  systems  must  not  only  provide  assur¬ 
ance  that  information  is  protected  based  on  its  sensitivity,  but  should  be  based 
on  a  data  model  as  sound  and  complete  as  the  conventional  relational  model. 

In  this  paper,  we  present  a  data  model  with  a  relational  algebra  and  update 
semantics  for  a  multilevel  secure  database  system  whose  protection  mechanisms 
are  provided  by  the  replicated  architecture.  The  approach  is  to  systematically 
describe  the  effects  of  treating  security  labels  as  data  and  to  define  explicitly  the 
semantics  of  these  data  labels  for  relational  database  operations.  We  also  briefly 
compare  the  SINTRA  data  model  to  earlier  ones  from  the  SeaView  project  and 
their  derivations. 
1  Introduction


Like  all  other  database  systems,  multilevel  secure  database  management  systems  (DBMS) 
are based upon a data model. Data models originated as a way of describing the structure
of data as used in the actual file systems of database management systems. Over
time,  they  have  evolved  to  modeling  data  from  the  point  of  view  of  the  users  and  of  the 
applications  of  the  database  management  system. 

We  take  the  current  view  of  data  model  for  a  database  system,  i.e.,  as  a  set  of 
concepts  that  can  be  used  to  describe  the  structure  of  and  the  operations  on  a  database 
[Nav92].  The  database  structure  includes  the  data  types,  relationships  and  constraints 
on  the  form,  or  “template,”  of  the  database.  The  database  operations  include  the  ways 
in  which  the  data  may  be  manipulated  via  retrievals  and  updates.  The  operations  ought 
to  provide  specifically  for  insertion,  deletion,  and  modification  of  the  data. 

The  relational  data  model  [Cod70]  is  a  good  example  of  a  data  model  from  this 
perspective.  The  structure  of  the  relational  data  model  is  well  known,  i.e.,  attributes, 
relational  schema,  etc.  The  operations  of  the  relational  data  model  are  provided,  in 
part, by the relational algebra [Ull82]. SQL completes the operations portion of the
data  model  by  incorporating  the  power  of  the  relational  algebra  into  a  query  language 
which  also  permits  insertion,  deletion  and  modification  of  data.  It  is  from  this  point 
of  view  that  we  have  approached  development  of  a  data  model  for  a  multilevel  secure 
relational  database  based  on  a  replicated  architecture. 

In  the  world  of  multilevel  secure  relational  database  systems,  early  data  models, 
which  were  for  the  TCB  subset  architecture,  tended  to  focus  on  the  definition  of  the 
structure  rather  than  the  operations  [Den87,  Lun90].  Later  work  did  consider  some 
operations [JaS91], but still relied on a similar set of underlying constraints to define
the structure of the data model. Most of these constraints derive from the work done in
the development of the SeaView multilevel secure relational database system [Lun90].

In  the  course  of  re-examining  data  models  while  developing  the  prototype  for  SIN¬ 
TRA  (a  high  assurance  multilevel  secure  relational  database  based  on  a  replicated  ar¬ 
chitecture),  we  discovered  that  some  of  the  constraints  of  these  early  data  models  were 
not  necessary,  in  our  opinion,  from  database  functionality  or  security  points  of  view,  but 
rather  seemed  to  stem  from  the  way  in  which  the  architectures  separate  data  to  provide 
mandatory  access  control. 

We  have  developed  a  new  data  model  specifically  for  the  SINTRA  prototype.  Like 
the  earlier  data  models  developed  for  the  TCB  subset  architecture,  the  multilevel  data 
model  is  specified  in  terms  of  the  conventional  relational  data  model  (but  for  a  different 
reason).  In  this  paper  we  will  describe  the  structure  of  the  SINTRA  data  model  and  the 
constraints on it. In some cases the constraints are the same as those of the TCB subset
data  models.  In  these  cases  we  will  compare  and  discuss  the  potential  for  enforcing 
the  constraint.  In  other  cases,  constraints  placed  upon  the  TCB  subset  data  model  can 
be  weakened  considerably  or,  in  some  cases,  eliminated  entirely,  thereby  improving  the 
functionality  of  database  systems  built  upon  the  corresponding  data  model.  In  these 
cases,  we  demonstrate  the  desirability  of  the  improvement  by  example. 

This paper is organized as follows. First, we present our criteria for a multilevel
secure relational data model and sketches of both the TCB subset and replicated architectures
for multilevel secure database systems. We then examine the structure
of  the  data  model,  comparing  the  constraints  imposed  for  each  choice  of  architecture 
as  described  above.  Finally,  we  define  the  semantics  of  the  insert,  update,  and  delete 
operations  for  the  SINTRA  data  model  and  describe  how  they  are  done. 


2  Criteria  for  A  Multilevel  Secure  Relational  Data 
Model 

Multilevel  secure  DBMSs  are  a  relatively  new  concept,  and  very  few  products  are  in  the 
process  of  evaluation.  Most  potential  users  of  multilevel  secure  DBMS  are  accustomed 
to relying on system-high databases. (System-high databases are regular untrusted
DBMSs in which only users whose clearances dominate “system high” can access the
databases.) In such DBMSs, all data that users can legitimately view in the system
is available. The data are not security-labeled but all output from such a system is
classified “system high” even though some data in it are in reality lower security level
data. The security of such systems is assured by clearing all users to system-high.

When users make the transition to multilevel secure relational database systems,
they will bring with them many expectations about what these systems will be able
to do based on their experience with system-high DBMSs. In developing data models
for multilevel secure relational database systems, these expectations should be accommodated
to the extent possible without compromising security. Alternatively, from the
user's point of view, a multilevel secure DBMS should allow secure access to all information
that users need to see and should provide operational capabilities equivalent to
those of a system-high DBMS. This means that (1) users should be able to represent
entities and associations among these entities, and (2) users should be able to store the
results of operations in relations (i.e., the relational algebra is closed for these operations).
It also implies that multilevel equivalents of relation schemas, relational algebra,
and perhaps even SQL should be formulated with as few restrictions as possible.
The data model that is presented here attempts to provide as many capabilities as the
current  system-high  databases  have.  For  example,  previous  data  model  investigations 
[JaS91,  Den87]  do  not  address  the  problem  of  inserting  the  result  of  a  join  operation  into 
another  relation  (i.e.,  INSERT  INTO  ...  )  and  how  the  new  tuples  should  be  classified. 
The  integrity  constraints,  relational  algebra,  and  update  semantics  in  this  paper  are 
influenced  by  the  need  that  users  have  for  these  database  operations. 

This work also differs from previous multilevel secure data model investigations
[JaS91, Den87] in its treatment of the conventional data model [Ull82] as a special
case of a multilevel secure data model. For example, if there is only one security level
(i.e., a system-high database), then this data model behaves in the same way as the conventional
data model if security labels are ignored (i.e., all multilevel relational algebra
and update operations behave exactly the same as the conventional relational algebra
and update operations).


3  The  Replicated  and  TCB  Subset  Architectures 
for  Multilevel  Secure  Databases 


This section presents only brief descriptions of both the replicated (SINTRA) and TCB
subset architectures. Detailed descriptions of these are readily available elsewhere and are
too lengthy to reproduce here; the purpose of this section therefore is to remind those
already familiar with those architectures and direct others to appropriate references.
Readers familiar with these architectures and the basic data model can skip this section.

3.1  TCB  Subset  Architecture 

There  are  two  variants  of  the  basic  TCB  subset  architecture.  TCB  subset  architectures 
rely  on  a  trusted  computing  base  (TCB)  to  separate  data  at  various  security  levels.  This 
approach requires that multilevel objects, relations, and attributes (sometimes) in the
case  of  multilevel  secure  relational  databases,  be  decomposed  into  single-level  entities 
which  can  be  protected  by  the  TCB.  The  user’s  view  of  the  multilevel  object  must  be 
recovered  from  the  single-level  entities  as  needed  via  database  views. 

In  the  TCB  subset  architecture,  the  DBMS  runs  under  the  control  of  a  trusted 
operating  system.  Each  user  of  the  system  has  a  distinct  multilevel  view  of  a  relation 
where  the  classification  of  the  view  is  dominated  by  the  user’s  clearance.  The  DBMS 
is  untrusted  and  operates  at  the  user’s  login  level;  the  instance  of  the  DBMS  running 
on the user’s behalf can retrieve data at that level and below via the trusted operating
system. 

3.1.1  Vertical  TCB  Subset  Architecture  (VTSA) 

This  is  the  original  form  of  the  TCB  subset  architecture  and  was  originally  produced  by 
the  SeaView  project  [Lun90].  The  decomposition-recovery  procedures  have  undergone 
several  iterations.  Basically  the  strategy  is  to  embed  multilevel  relations  into  the  con¬ 
ventional  relational  structure  by  creating  additional  classification  attributes  to  carry  the 
security  label  of  the  “real”  attribute  values  (as  do  all  the  architectures  we  consider  in 
this  paper).  Multilevel  relations  are  then  decomposed  vertically  (and  horizontally)  into 
single-level  relations  using  the  “real”  key  values  and  the  classification  attributes.  The 
conventional join operation is the primary means of recovering the multilevel relation
from  the  single-level  fragments.  Details  can  be  found  in  [Den87,  JaS90]. 

3.1.2  Horizontal  TCB  Subset  Architecture  (HTSA) 

This variant of the TCB subset architecture is similar to the previous one except that
the decomposition-recovery scheme replaces the join operation of the recovery process
with the union operation. This is accomplished by horizontally decomposing the multilevel
relations by security level and entering special markers in place of lower level data.
A complete description of this model is presented in [JaS91].


3.2  Replicated  (SINTRA)  Architecture 

The  SINTRA  architecture  relies  on  physical  separation  of  data  by  security  level  and 
replication of data across security levels to provide high assurance protection of information
[KFC92, Kan94].

The replicated (SINTRA) architecture has a trusted frontend and an untrusted backend
database system for each security level. Each backend DBMS contains information
at a given security class together with replicated information from each lower backend
database. Hence users have access to a single DBMS containing all and only the information
they are cleared to see. For example, secret users have access to the secret
backend, which contains secret, confidential, and unclassified data, and confidential
users have access to the confidential backend with only confidential and unclassified
data. Because data is retrieved from only one backend, queries cannot be used by Trojan
horse code in user applications to leak information to malicious processes on a lower
security level system.
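
The replication scheme can be illustrated with a toy model. The sketch below assumes three totally ordered levels (U, C, S) and a hypothetical `ReplicatedStore` class. It ignores the trusted frontend and the actual replication protocol of SINTRA, and only shows which data each backend ends up holding.

```python
LEVELS = ["U", "C", "S"]   # assumed order: unclassified < confidential < secret


class ReplicatedStore:
    """Toy model: one backend per level, each holding its own and all lower data."""

    def __init__(self):
        self.backends = {level: [] for level in LEVELS}

    def insert(self, level, row):
        # Store at the originating level and replicate to every higher backend.
        start = LEVELS.index(level)
        for target in LEVELS[start:]:
            self.backends[target].append((level, row))

    def query(self, clearance):
        # A query touches exactly one backend: the one at the user's level.
        return list(self.backends[clearance])


store = ReplicatedStore()
store.insert("U", "u-row")
store.insert("C", "c-row")
store.insert("S", "s-row")
```

Because `query` reads a single backend, a confidential query never contacts the secret backend, which is the property the paragraph above relies on.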
4  The  Data  Model  —  Structure

Multilevel  secure  DBMSs  have  been  generally  defined  as  follows.  The  protected  objects 
of  the  secure  system  are  data  items  and  the  subjects  of  the  secure  system  are  operations 
or  sequences  of  operations  on  the  data  items.  Each  subject  or  object  has  a  security  label 
from  a  security  lattice  and  the  mandatory  access  control  policy  (or  security  policy)  is  a 
variant  of  the  standard  Bell-LaPadula  policy  [BeL76]:  Subjects  may  read  objects  at  or 
below  their  own  security  label  but  write  objects  only  at  their  own  security  level. 
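
The access rule stated above can be written down directly. The sketch assumes a totally ordered set of three levels for simplicity, whereas the text allows a general security lattice; the function names are illustrative.

```python
RANK = {"U": 0, "C": 1, "S": 2}   # assumed totally ordered levels


def can_read(subject, obj):
    # Subjects may read objects at or below their own security label.
    return RANK[obj] <= RANK[subject]


def can_write(subject, obj):
    # Subjects may write objects only at their own security level.
    return RANK[subject] == RANK[obj]
```

Under this rule a confidential subject can read unclassified and confidential data but can write only confidential data.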

In  this  paper,  since  we  are  interested  in  relational  systems,  the  data  items  are  taken 
to  be  values  of  relational  attributes.  This  is  commonly  called  element  level  labeling 
(as  opposed  to  tuple  level  labeling,  which  is  not  addressed  here).  In  the  remainder  of 
this  section,  we  discuss  the  structure  and  constraints  of  the  SINTRA  data  model  and 
compare/contrast  it  with  the  TCB  subset  architecture  data  models,  both  VTSA  and 
HTSA.


4.1  Relational  Schema 

Given a conventional relational schema

$R(A_1, A_2, \ldots, A_n)$

the corresponding SINTRA multilevel relation scheme is denoted by

$R(A_1, C_1, A_2, C_2, \ldots, A_n, C_n, TL)$

where each $A_i$ is a data attribute over domain $D_i$, each $C_i$ is a classification attribute
for $A_i$ and $TL$ is the tuple-level attribute. $R(A_1, A_2, \ldots, A_n)$ is the underlying relation
as viewed by the user.

Notice  that  the  security  label  of  a  given  attribute  value  is  the  value  of  another 
related  attribute.  Thus  the  security  labels  are  stored  as  data  in  the  conventional  sense. 
There is nothing in data model theory that requires that the security labels of data
items be themselves relational data, but only that a security label be associated. The
reason that both SINTRA and the TCB subset architectures take this approach is that
both  anticipated  exploiting  conventional  relational  DBMSs  to  implement  their  systems 
by  embedding  them  in  the  conventional  relational  model.  In  the  SINTRA  case,  the 
conventional  systems  are  used,  unaltered  internally,  as  the  backends,  while  in  the  TCB 
subset  case,  the  conventional  systems  store  relational  data  in  files  whose  separation  is 
assured  by  the  TCB. 

Given a tuple $t$ in relation $R$, $t[X_i]$ denotes the value of attribute $X_i$ in tuple $t$.
We use $R$ ambiguously for both $R(A_1, A_2, \ldots, A_n)$ and
$R(A_1, C_1, A_2, C_2, \ldots, A_n, C_n, TL)$ with context being the arbiter.

In  the  SINTRA  schema,  the  TL  attribute  represents  the  security  level  at  which 
the  tuple  originated  whereas  in  the  TCB  subset  model,  the  tuple  class  is  simply  the 
maximum security level of all attributes. Notice that $t[TL]$ in the SINTRA data model
is not just the least upper bound of all $t[C_i]$ because complex operations performed
at a high security level may generate tuples with all attribute value labels lower than
$t[TL]$. How this can occur will be discussed in Section 5.2.1, which describes the
operations  of  the  SINTRA  model. 


4.2  Entity  Integrity  Constraint 

Let $A_{PK}$ be the primary key of relation $R(A_1, A_2, \ldots, A_n)$. A multilevel relation $R$
satisfies entity integrity if and only if for all tuples $t$ in relation $R$

(1)  $A_i \in A_{PK} \Rightarrow t[A_i] \neq \text{null} \wedge t[C_i] \neq \text{null}$, and

(2)  $t[TL] \geq t[C_i]$ for any $i$.

Condition (1) is similar to the definition of entity integrity for the untrusted relational
databases. Condition (2) requires that if the tuple is generated or modified by a $t[TL]$-user
then any element classification within the same tuple should be equal to or lower
than $t[TL]$.

The TCB subset models have a more restrictive constraint. In particular they require
that the primary key value be uniformly classified. That is,

$A_i, A_j \in A_{PK} \Rightarrow t[C_i] = t[C_j]$.

This restriction does reduce functionality. Consider this highly simplified example
of a relational system:


EMP(ss#, name, address)
MISSION(mission#, description, skills_needed)
EMP-MISSION(ss#, mission#, location)


The EMP-MISSION relation represents a many-to-many association. It is possible that
ss# has a value which is classified at a low level while mission# and location are classified
higher. But the primary key of the EMP-MISSION relation is {ss#, mission#}, and the
TCB subset data models would prohibit this relation, because its key is not uniformly
classified. In the SINTRA architecture, this restriction is unnecessary. This
is  important  because  relations  represent  both  entities  and  associations,  and  associations 
can  be  more  sensitive  than  the  entities  whose  relationship  is  defined. 
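
The two entity integrity variants can be compared on the EMP-MISSION example. The sketch below is an illustration only: the tuple representation (a dictionary with `values`, `labels`, and `tl`), the function names, the attribute values, and the three totally ordered levels are assumptions made here.

```python
RANK = {"U": 0, "C": 1, "S": 2}   # assumed totally ordered levels


def sintra_entity_integrity(t, key):
    # Condition (1): key attributes and their classifications are non-null.
    key_ok = all(t["values"][a] is not None and t["labels"][a] is not None
                 for a in key)
    # Condition (2): the tuple level dominates every element classification.
    tl_ok = all(RANK[t["tl"]] >= RANK[c]
                for c in t["labels"].values() if c is not None)
    return key_ok and tl_ok


def tcb_uniform_key(t, key):
    # TCB subset restriction: all key attributes carry the same classification.
    return len({t["labels"][a] for a in key}) == 1


KEY = ("ss#", "mission#")
assignment = {   # hypothetical EMP-MISSION tuple: low ss#, high mission# and location
    "values": {"ss#": "000-00-0000", "mission#": 7, "location": "Tolas"},
    "labels": {"ss#": "U", "mission#": "S", "location": "S"},
    "tl": "S",
}
```

The tuple passes the SINTRA check but fails the uniform-key check, which is the loss of functionality discussed above.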

4.3  Null  Integrity  Constraint 

We  mention  this  only  for  completeness  as  it  is  unnecessary  for  the  SINTRA  data  model. 
Some  versions  of  the  TCB  subset  model  enforce  constraints  on  nulls;  namely  that  nulls 
are  classified  at  the  level  of  the  key  (which  is  uniformly  classified). 

In  the  SINTRA  model,  classification  of  nulls  is  determined  as  for  other  data  by  the 
operations  used  to  enter  or  modify  data. 


4.4  Polyinstantiation  Integrity  Constraint  (PI) 

For the SINTRA model, let $A_{PK}$ be the primary key of $R(A_1, A_2, \ldots, A_n)$ and let $C_{PK}$
be the corresponding classification attributes in $R(A_1, C_1, A_2, C_2, \ldots, A_n, C_n, TL)$. A
multilevel relation $R$ satisfies the polyinstantiation integrity property if and only if for
every $i$, $1 \leq i \leq n$, the functional dependency

$A_{PK}, C_{PK}, C_i, TL \rightarrow A_i$

holds. The user specified primary key $A_{PK}$ in conjunction with the classification attributes
$C_{PK}$, $C_i$ and $TL$ uniquely determines the value of attribute $A_i$. This constraint
limits  polyinstantiation  within  a  single  security  class.  That  is,  given  values  for  the 
primary  key  and  for  all  the  classification  attributes,  the  tuple  is  determined  uniquely. 

In the SINTRA model, the polyinstantiation constraint is imposed as an extension of
entity integrity, treating $R(A_1, C_1, A_2, C_2, \ldots, A_n, C_n, TL)$ as a conventional relation
whose primary key is $A_{PK}, C_1, \ldots, C_n, TL$.
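
Read this way, PI is an ordinary key-uniqueness check on the embedded relation. The sketch below assumes a hypothetical row representation (dictionaries with `values`, `labels`, and `tl`) and a toy relation with key attribute `k` and one further attribute `a`; none of these names come from the SINTRA prototype.

```python
def satisfies_pi(rows, key, attrs):
    """Check that (A_PK, all C_i, TL) determines the attribute values."""
    seen = {}
    for row in rows:
        determinant = (
            tuple(row["values"][a] for a in key),     # A_PK
            tuple(row["labels"][a] for a in attrs),   # C_PK and every C_i
            row["tl"],                                # TL
        )
        dependent = tuple(row["values"][a] for a in attrs)
        # Two rows with the same determinant must agree on every value.
        if seen.setdefault(determinant, dependent) != dependent:
            return False
    return True


ATTRS = ("k", "a")
KEY = ("k",)
low = {"values": {"k": 1, "a": "x"}, "labels": {"k": "U", "a": "U"}, "tl": "U"}
high = {"values": {"k": 1, "a": "y"}, "labels": {"k": "U", "a": "C"}, "tl": "C"}
clash = {"values": {"k": 1, "a": "z"}, "labels": {"k": "U", "a": "U"}, "tl": "U"}
```

Rows `low` and `high` share a key value but differ in classification, so they may coexist; `clash` repeats the classifications of `low` with a different value and is rejected.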

PI  is  easiest  to  enforce  in  the  VTSA  model  because  only  a  single  instance  of  a 
pair  of  values  tJA,,  CiJ  is  kept  for  attribute  A,  not  in  App-.  HTSA  requires  using  a 
decomposition-recovery  technique  which  is  subject  to  ambiguous  interpretation^  The 
SINTRA  model  has  slightly  more  difficulty  enforcing  the  constraint.  Consider  the  stan¬ 
dard  example  [.JaS91]  of  the  relation. 

SOD(ship,  objective,  destination) 

'The  ?-replacement  rule  of  the  recovery  algorithm  does  not  specify  which  “replacement”  to  choose 
if  several  are  available.  Unless  this  is  very  carefully  done,  the  original  relation  is  not  recovered. 
and the following sequence of actions:

Operation/level   Ship    Objective   Destination   TL
1. Insert U       Ent U   Explore U   Tolas U       U    (deleted after action 3)
2. Update C       Ent U   Mine C      Tolas U       C
3. Update U       Ent U   Explore U   Sirus U       U
4. Update S       Ent U   Spy S       Tolas U       S
                  Ent U   Spy S       Sirus U       S


where 

-  action 2 polyinstantiates the original tuple

-  action 3 updates (and replaces) the original tuple

-  action 4 updates the existing tuples by entering a secret objective.

The consequence of this sequence of actions is a pair of S-tuples which do not
satisfy the PI constraint.


The  resolution  of  this  difficulty  in  the  SINTRA  model  can  be  handled  by  requiring 
that  the  update  of  the  Destination  attribute  by  action  3  be  propagated  to  the  C-level 
tuples  as  well,  so  that  the  U-labeled  Destination  is  always  Sirus.  Thus  the  S-tuple 
whose  Destination  is  Tolas  would  not  appear.  In  general,  an  attribute  value  update 
must  be  propagated  to  all  higher  level  tuples  which  were  derived  from  the  updated  tuple 
by  polyinstantiation. 
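
The SOD example can be replayed in a few lines. The following is only a sketch under stated assumptions: the levels are totally ordered U < C < S, every update applies to all tuples of the relation, and the names `Row`, `update`, `run`, `pi_key` and `pi_violations` are hypothetical helpers, not part of the SINTRA design. With `propagate=False` the four actions leave two S-tuples that agree on the PI key; with `propagate=True` the update of action 3 also reaches the C-level tuple and only one S-tuple results.

```python
from itertools import combinations
from typing import NamedTuple

RANK = {"U": 0, "C": 1, "S": 2}  # assumed total order U < C < S


class Row(NamedTuple):
    ship: tuple         # (value, classification); ship is the primary key
    objective: tuple
    destination: tuple
    tl: str             # tuple level TL


def pi_key(r):
    """Left side of the PI dependency: key, all classification labels, TL."""
    return (r.ship, r.objective[1], r.destination[1], r.tl)


def pi_violations(rel):
    """Pairs of distinct tuples that agree on the PI key."""
    return [(a, b) for a, b in combinations(sorted(rel), 2)
            if pi_key(a) == pi_key(b)]


def update(rel, level, attr, value, propagate):
    """Apply an update by a user at `level` to every tuple of rel (a set)."""
    out = set()
    for r in rel:
        if r.tl == level:
            # Same level: the tuple is updated in place.
            out.add(r._replace(**{attr: (value, level)}))
        elif RANK[r.tl] < RANK[level]:
            # Lower tuple: keep it and add a polyinstantiated copy at `level`.
            out.add(r)
            out.add(r._replace(**{attr: (value, level), "tl": level}))
        elif propagate and getattr(r, attr)[1] == level:
            # Higher tuple still carrying a `level`-labeled value: propagate.
            out.add(r._replace(**{attr: (value, level)}))
        else:
            out.add(r)
    return out


def run(propagate):
    rel = {Row(("Ent", "U"), ("Explore", "U"), ("Tolas", "U"), "U")}  # action 1
    rel = update(rel, "C", "objective", "Mine", propagate)            # action 2
    rel = update(rel, "U", "destination", "Sirus", propagate)         # action 3
    rel = update(rel, "S", "objective", "Spy", propagate)             # action 4
    return rel
```

Calling `pi_violations(run(False))` returns the offending pair of S-tuples, while `pi_violations(run(True))` returns an empty list.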


5  The  Data  Model  —  Operations 

The relational algebra and modification actions for conventional databases are well
established, but for multilevel secure databases these operations are not well defined.
Indeed, different data models may have different algebras. In this section, we define
multilevel relational algebra and modification operations for the SINTRA data model in
terms of the relational algebra for conventional databases, consistent with our embedding
the multilevel relations in the conventional relational structure.

Before doing this we establish some notation. Given a security level c, the c-user’s
view of a multilevel secure database consists of c-level information and other information
from levels strictly dominated by c. Consider, for the time being, a DBMS with
two security classes, high (H) and low (L), where H dominates L. Let L-user denote a
user whose session level is L. Let R_c denote the portion of a relation R that is generated by
c-users for c = L or H. The relation R in the low untrusted backend database will have
only R_L, and all tuples in R_L have tuple-level L. However, the relation R in the high
untrusted backend database contains R_L ∪ R_H.

For  the  TCB  subset  architectures,  definitions  of  relational  operations  have  not  been 
completed.  For  VTSA,  the  SeaView  final  technical  report  [Sho89]  specifies  a  multilevel 
SQL  (MSQL)  subsystem  and  gives  the  syntax  of  select,  update,  insert  and  delete 
statements  but  does  not  specify  the  actual  effect  of  these  on  the  underlying  multilevel 
relations. For HTSA, update, insert and delete are defined in [JaS91]. We will
specify the equivalents of these and extend them to more complex operations.


5.1  A  Multilevel  Secure  Relational  Algebra 

As  in  the  conventional  relational  algebra,  there  are  relations  and  operations  in  the 
multilevel  relational  algebra.  The  relations  represent  both  entities  and  associations. 
Operations  take  as  arguments  one  or  two  relations  and  produce  another  relation.  We 
define  a  very  simple  relational  algebra  which  is  sufficient  to  give  the  general  idea  of  how 
it  can  be  extended.  A  more  extensive  relational  algebra  and  multilevel  SQL  for  the 
SINTRA  model  are  the  subject  of  current  research. 

We can express the multilevel relational algebra between two multilevel relations R
and S in terms of the conventional (single-level) relational algebra among R_L, R_H, S_L and
S_H. Relational algebra among R_L, S_L, R_H, and S_H is exactly the same as conventional
relational algebra [Ull82] unless otherwise stated. We use the same operator notation
for both multilevel relational algebra and the conventional relational algebra, since it is
clear from the context which is intended.

Multilevel relational algebra between two multilevel relations R and S in the high
backend can be defined as:

•  Select (σ) and project (π) operations act exactly the same as those in conventional
relational algebra

•  R ∪ S = (R_L ∪ R_H) ∪ (S_L ∪ S_H)

•  R − S = (R_L − S_L) ∪ (R_H − S_H)

•  R × S = (R_H × S_H) ∪ (R_H × S_L) ∪ (R_L × S_H) ∪ (R_L × S_L)

where the operators on the left side of the definitions are what are being defined and
those on the right hand side are those of the conventional relational algebra.
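
As a small illustration of these definitions, the sketch below represents a multilevel relation as a Python set of tuples whose last field is the tuple level, and builds each multilevel operator from conventional set operations on the L and H parts. The representation and the helper names (`part`, `cross`, `ml_union`, `ml_difference`, `ml_cross`) are assumptions made for this example only; in particular, the cross product here simply concatenates tuples and does not relabel the result.

```python
from itertools import product


def part(rel, level):
    """R_c: the tuples of rel whose tuple level (last field) equals `level`."""
    return {t for t in rel if t[-1] == level}


def cross(r, s):
    """Conventional cross product: concatenate every pair of tuples."""
    return {a + b for a, b in product(r, s)}


def ml_union(r, s):
    return (part(r, "L") | part(r, "H")) | (part(s, "L") | part(s, "H"))


def ml_difference(r, s):
    return (part(r, "L") - part(s, "L")) | (part(r, "H") - part(s, "H"))


def ml_cross(r, s):
    levels = ("H", "L")
    pieces = [cross(part(r, a), part(s, b)) for a in levels for b in levels]
    return set().union(*pieces)


R = {(1, "L"), (2, "H")}
S = {(1, "L"), (3, "H")}
```

Because every tuple in the high backend carries level L or H, each multilevel operator returns the same tuples as the conventional operator applied to the whole relations, for example `ml_cross(R, S) == cross(R, S)`.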

The select and project operations may include the security label attributes in their
parameter sets. The cross product operation is particularly significant because it is
essential in retrieving related data from multiple relations. The join operations, including
equijoin, outer join, etc., depend directly upon the cross product. These join
operations can be defined using the cross product, select, and project operations as usual. It
is important to note that without the cross product, the operators which one can use
would be limited to operations on single relations, rendering the relational system no
more powerful than a flat file database.

Given  a  relational  algebra,  an  SQL-like  retrieval  language  can  be  defined.  It  is  also 
evident  that  this  could  be  readily  extended  to  an  arbitrary  security  lattice. 

However, we believe this relational algebra is overly simplistic as it places the
responsibility for the manipulation of data with respect to security labels completely on the
user.  It  is  our  intention  to  extend  the  definition  of  the  relational  algebra  primitives  so 
that  much  of  the  security  label  specification  in  queries  can  be  eliminated.  In  particular, 
variants  of  the  usual  database  operations  can  be  defined  depending  upon  whose  view 
of  the  data  a  user  wishes  to  see.  We  will  not  pursue  this  in  detail  in  this  paper,  giving 
only  the  following  example  of  two  variants  of  the  cross  product  operator  for  a  specific 
security  lattice. 

Consider the following security lattice.

        H
      /   \
    M1     M2
      \   /
        L

H > M1 > L
H > M2 > L


One  cross  product  (complete  cross  product)  at  the  high  (H)  security  class  can  be 
expressed  as: 

R × S = ∪ { R_a × S_b | a, b ∈ {L, M1, M2, H} } =

(R_H × S_H) ∪ (R_H × S_M1) ∪ (R_H × S_M2) ∪ (R_H × S_L) ∪

(R_M1 × S_H) ∪ (R_M1 × S_M1) ∪ (R_M1 × S_M2) ∪ (R_M1 × S_L) ∪

(R_M2 × S_H) ∪ (R_M2 × S_M1) ∪ (R_M2 × S_M2) ∪ (R_M2 × S_L) ∪

(R_L × S_H) ∪ (R_L × S_M1) ∪ (R_L × S_M2) ∪ (R_L × S_L)

Each tuple will have t[TL] = H because an H-level user performs the cross product. For
this operation, the result is the product of all data that the H-level user is entitled to
see.


Another  cross  product  (dominance  cross  product)  can  be  defined  in  the  following 
way: 

R × S = ∪ { R_a × S_b | a ≥ b or a ≤ b, and a, b ∈ {L, M1, M2, H} }

The dominance cross product allows the H-user to see the data as it would be seen by
M1-users, M2-users, and L-users, but nothing more. Thus (R_M1 × S_M2) and (R_M2 ×
S_M1) are not included in this case because there is no dominance relationship between
the two security classes M1 and M2, and so neither would be visible at either level M1
or M2.
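
The difference between the two variants is easiest to see by enumerating the level pairs each one admits. The sketch below hard-codes the diamond lattice above; `DOMINATES`, `complete_pairs`, `dominance_pairs` and `cross_product` are hypothetical names, and relations are assumed to be given as dictionaries from a level to the set of tuples generated at that level.

```python
from itertools import product

LEVELS = ("L", "M1", "M2", "H")

# DOMINATES[a] is the set of levels dominated by a (including a itself).
DOMINATES = {
    "L": {"L"},
    "M1": {"M1", "L"},
    "M2": {"M2", "L"},
    "H": {"H", "M1", "M2", "L"},
}


def dominates(a, b):
    return b in DOMINATES[a]


def complete_pairs():
    """All (a, b) combinations: the complete cross product."""
    return set(product(LEVELS, LEVELS))


def dominance_pairs():
    """Only comparable pairs: the dominance cross product."""
    return {(a, b) for a, b in product(LEVELS, LEVELS)
            if dominates(a, b) or dominates(b, a)}


def cross_product(r_parts, s_parts, pairs):
    """Union of R_a x S_b over the given level pairs, performed by an H-user,
    so every result tuple gets tuple level H appended."""
    return {ra + sb + ("H",)
            for a, b in pairs
            for ra in r_parts.get(a, ())
            for sb in s_parts.get(b, ())}
```

The complete variant admits all 16 pairs; the dominance variant drops exactly `("M1", "M2")` and `("M2", "M1")`.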


5.2  Data  Modification  Operations 

In  this  section,  we  will  present  the  SINTRA  equivalents  of  the  insert,  update,  and 
delete  operations.  We  use  replace  rather  than  “update”  to  avoid  confusion  with  the 
update  operations  which  are  used  to  maintain  the  consistency  of  replicated  data.  Even 
though  we  will  use  SQL-like  syntax  to  describe  the  form  of  these  operations,  we  will  use 
retrieve  rather  than  “select”  for  a  similar  reason.  The  insert  and  replace  operations 
are  defined  in  both  simple  and  complex  forms. 

A c-user’s view of a multilevel secure database consists of c-level information and
other information from levels strictly dominated by c. Therefore, we concentrate our
discussion on a single user level (e.g., c-level).


5.2.1  Insert 

The insert operation allows users to add new tuples to an existing relation. A c-user
cannot insert values in any classification attribute C_i or in the tuple-level TL. These are all
implicitly given the value c, which is the user’s login level, by the system.

There  are  two  types  of  insert  operations:  (1)  simple  insert  and  (2)  complex  insert. 
We  discuss  the  interpretation  of  each  operation  in  the  following. 


Simple  Insert 

The  simple  insert  query  executed  by  a  c-user,  where  the  access  class  c  is  implicitly 
determined  by  the  user’s  login  class,  has  the  following  general  form: 

insert into R [(A1 [, A2, ...])]
values (a1 [, a2, ...])

In this notation, the brackets denote optional items and the “...” signifies repetition. If
the list of attributes is omitted, it is assumed that all the data attributes in relation R
are specified.

Let t be the tuple to be inserted such that if A_i is in the attribute list of the insert
query, then t[A_i] = a_i and t[C_i] = c; otherwise t[A_i] = null and t[C_i] = c. The insertion
is permitted if and only if:

•  t[A_PK] contains no nulls (as is necessary to enforce the entity integrity constraint).

•  For all u in relation R, u[A_PK, C_1, ..., C_n, TL] ≠ t[A_PK, C_1, ..., C_n, TL].

Otherwise it is rejected, because it would violate polyinstantiation integrity. In other
words, a c-user can insert a tuple t in relation R if R_c does not already have a tuple
with the same primary key, attribute classifications, and tuple level classification. Each
element-class and tuple-level of a new tuple are set to c.
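
The two acceptance conditions can be checked mechanically. The following sketch assumes a relation stored as a list of dictionaries, each mapping an attribute name to a (value, label) pair plus a "TL" entry; `simple_insert` and its parameters are hypothetical names chosen for this example.

```python
def simple_insert(relation, attrs, key_attrs, values, c):
    """Try to insert `values` (attribute -> value) as a c-user.

    Returns True and appends the tuple if the insert is permitted,
    otherwise returns False and leaves the relation unchanged.
    """
    # Unlisted attributes become null; every label and TL is forced to c.
    t = {a: (values.get(a), c) for a in attrs}
    t["TL"] = c

    # Entity integrity: the primary key may not contain nulls.
    if any(t[a][0] is None for a in key_attrs):
        return False

    # Polyinstantiation integrity: no existing tuple may agree with t on
    # the key, all classification attributes, and the tuple level.
    def pi_key(u):
        return (tuple(u[a] for a in key_attrs),
                tuple(u[a][1] for a in attrs),
                u["TL"])

    if any(pi_key(u) == pi_key(t) for u in relation):
        return False

    relation.append(t)
    return True


ATTRS = ["ship", "objective", "destination"]
KEY = ["ship"]
sod = []
first = simple_insert(sod, ATTRS, KEY, {"ship": "Ent", "objective": "Explore"}, "U")
again = simple_insert(sod, ATTRS, KEY, {"ship": "Ent", "objective": "Mine"}, "U")
```

Here `first` is accepted and `again` is rejected, since a U-level tuple with key Ent already exists; the same key inserted by an S-user would be accepted because its labels and tuple level differ.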


Complex  Insert 

The complex insert queries executed by a c-user have the following general form:

insert into R [(A1 [, A2, ...])]
retrieve a1 [, a2, ...]
from T1 [, T2, ...]
[where cond]

where T_i denotes a relation name, a_i signifies any attribute of a relation in the from clause
or an expression, and cond may contain any relational algebraic Boolean expression using
attributes of tables (including security level attributes) in the from clause.

The  insertion  is  permitted  and  the  tuple-level  classification  of  tuples  to  be  inserted 
will  be  set  to  c  if  and  only  if: 


•  For all u in relation R, u[A_PK, C_1, ..., C_n, TL] ≠ t[A_PK, C_1, ..., C_n, TL],

where t is a new tuple to be inserted (i.e., polyinstantiation integrity is preserved).




For example, if the security lattice is {H, L} and an H-user wants a cross product
operation between two relations T and S to be inserted into relation R, then (T × S)
= (T_H × S_H) ∪ (T_H × S_L) ∪ (T_L × S_H) ∪ (T_L × S_L) will be inserted into relation R
provided that newly generated tuples do not violate the above condition. The tuple level
classification of newly inserted tuples will be H because an H-user creates those tuples.

Notice that this can create tuples t where the tuple level strictly dominates the
security label of every attribute. In particular, every tuple in T_L × S_L will have the
security label of each attribute set to L but the tuple levels will be H. This represents a
real world situation where the classification of individual information may be lower than
that of the association of the information. An example of this case is given in [Lun89],
where EMPLOYEE(EMP#, NAME, DEPT, ...) and PROJECT(PROJ#, BUDGET,
...) are confidential relations while WORK-ON(EMP#, PROJ#) may be a secret
relation.


5.2.2  Replace 

The replace operation allows users to change data attribute values in existing tuples,
but the replicated architecture system does not allow a user to modify values of the
classification attributes C_i or the tuple-level attribute TL. These are controlled by the
mechanisms of the replace operation.

The replace queries executed by a c-user have the following two general forms:

Simple Replace

replace R
set A_i = s_i [, A_j = s_j, ...]
[where cond]

Complex Replace

replace R
set A_i = query_i [, A_j = query_j, ...]
[where cond]

where s_i is an expression, query_i is a retrieve query that must return exactly one value
in the domain of A_i, and cond is a Boolean expression which identifies those tuples in R
that are to be modified.

Let R' = ∪ { R_c' | c' strictly dominated by c } and define two sets, S and S':

S = { t ∈ R_c | t satisfies the cond in where clause }

S' = { t ∈ R' | t satisfies the cond in where clause }

We first consider the case where R = R_c ∪ R'. If t ∈ S then t will be replaced by the
new tuple t' where

t'[A_i, C_i] = < s_i or query_i, c >    if A_i is in the set clause
t'[A_i, C_i] = t[A_i, C_i]             otherwise

This updates the tuples at the user security level. If t ∈ S' then consider the new tuple
t'' where

t''[A_i, C_i] = < s_i or query_i, c >   if A_i is in the set clause
t''[A_i, C_i] = t[A_i, C_i]            otherwise

and t''[TL] = c. This determines the new (polyinstantiated) tuples to be produced by
the replace. If there exists u ∈ R_c for which u[A_PK, C_1, ..., C_n, TL] = t''[A_PK,
C_1, ..., C_n, TL], replace u by t''. Else add t'' to R_c where t''[TL] = c.

The effect of this rule is to update tuples which may have been derived by
polyinstantiation from lower-level tuples. If a lower-level tuple has not already been
polyinstantiated, it is done at this point.

For instance, consider an H-user’s query to do a replace on relation R, where the where
clause contains a join operation between two relations R and T (i.e., R ⋈ T) where R
= (R_L ∪ R_H) and T = (T_L ∪ T_H). Let two sets, S and S' be:

S = { t ∈ π_R (σ_cond (R_H × T)) }

S' = { t ∈ π_R (σ_cond (R_L × T)) }

The values of tuples in set S will be modified. The tuples in set S' represent
polyinstantiated tuples that will be potentially inserted into R_H or replaced in R_H after appropriate
values are modified. In particular, the tuple class will become c.

Now we consider the effect of replace on tuples above the user class. Let R'' = ∪
{ R_c'' | c'' > c }, t ∈ S, and A_i in the set clause. If there is a tuple t'' ∈ R'' that is a
polyinstantiation of t, i.e., t[A_PK, C_PK] = t''[A_PK, C_PK], and t''[C_i] = c, then its value
will be modified according to s_i or query_i.
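
The three cases of replace (tuples at the user's level, tuples below it, and polyinstantiated tuples above it) can be put together in one routine. This is a sketch only: it assumes totally ordered levels U < C < S, handles the simple form with constant new values, and uses hypothetical names (`replace`, `ATTRS`, `KEY`). Tuples are dictionaries mapping each attribute to a (value, label) pair plus a "TL" entry.

```python
RANK = {"U": 0, "C": 1, "S": 2}  # assumed total order U < C < S
ATTRS = ["ship", "objective", "destination"]
KEY = ["ship"]


def replace(rel, c, new_values, cond=lambda t: True):
    """Simple replace by a c-user; returns the new relation as a list."""

    def key(t):
        return tuple(t[a] for a in KEY)          # key values with labels

    def pi_key(t):
        return (key(t), tuple(t[a][1] for a in ATTRS), t["TL"])

    def rewritten(t):
        new = dict(t, TL=c)                      # tuple level becomes c
        for a, v in new_values.items():
            new[a] = (v, c)                      # new value labeled c
        return new

    S = [t for t in rel if t["TL"] == c and cond(t)]
    S_low = [t for t in rel if RANK[t["TL"]] < RANK[c] and cond(t)]

    out = []
    for t in rel:
        if t["TL"] == c and cond(t):
            out.append(rewritten(t))             # case 1: t in S
        elif RANK[t["TL"]] > RANK[c] and any(key(t) == key(s) for s in S):
            new = dict(t)                        # case 3: higher tuple that
            for a, v in new_values.items():      # polyinstantiates some t in S
                if t[a][1] == c:
                    new[a] = (v, c)
            out.append(new)
        else:
            out.append(t)

    for t in S_low:                              # case 2: t in S'
        new = rewritten(t)
        for i, u in enumerate(out):
            if pi_key(u) == pi_key(new):
                out[i] = new                     # replace the matching c-tuple
                break
        else:
            out.append(new)                      # or add a new one
    return out


rel = [
    {"ship": ("Ent", "U"), "objective": ("Explore", "U"),
     "destination": ("Tolas", "U"), "TL": "U"},
    {"ship": ("Ent", "U"), "objective": ("Mine", "C"),
     "destination": ("Tolas", "U"), "TL": "C"},
]
after_u = replace(rel, "U", {"destination": "Sirus"})
after_s = replace(after_u, "S", {"objective": "Spy"})
```

In this run the U-user's change of destination reaches the C-level tuple as well, so the later S-level replace produces a single S-tuple rather than two conflicting ones.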


5.2.3  Delete

The  delete  queries  executed  by  a  c-user  have  the  following  general  form: 

delete from R
[where cond]

Tuples that satisfy cond in the where clause will be deleted from R_c. Said differently,
in view of the *-property [BeL76], only those tuples t that satisfy cond and t[TL] = c are
deleted from relation R.

Consider an H-user’s query to delete tuples from relation R that satisfy a join
condition between two relations R and T (i.e., R ⋈ T) where R = (R_L ∪ R_H) and T = (T_L
∪ T_H). The tuples in set S where S = { t ∈ π_R (σ_cond (R_H × T)) } will be deleted
from R_H.
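
A minimal sketch of this rule, assuming tuples are dictionaries with a "TL" entry and the join condition is supplied as a predicate; `delete` and `joins_T` are hypothetical names used only for this example.

```python
def delete(rel, c, cond=lambda t: True):
    """Delete as a c-user: only tuples with TL == c that satisfy cond go."""
    return [t for t in rel if not (t["TL"] == c and cond(t))]


R = [
    {"ship": "Ent", "TL": "L"},
    {"ship": "Ent", "TL": "H"},
    {"ship": "Orion", "TL": "H"},
]
T = [{"ship": "Ent", "TL": "L"}]


def joins_T(t):
    """Join condition: t matches some tuple of T on the ship attribute."""
    return any(t["ship"] == u["ship"] for u in T)


remaining = delete(R, "H", joins_T)
```

The L-level tuple for Ent survives even though it satisfies the join condition, because an H-user may only delete tuples whose tuple level is H.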

Even  though  our  examples  in  this  section  use  relations  that  have  two  security  levels, 
we  can  easily  generalize  the  concept  to  relations  that  have  a  general  security  structure 
as  we  did  in  section  5.1. 

Adding  these  update  operations  to  the  SQL-like  retrieval  language  derived  from  the 
relational  algebra  yields  an  SQL-like  language,  capable  of  performing  all  basic  multilevel 
relational  database  operations. 


6  Summary 


We presented a data model for the SINTRA architecture database system. Based on
this data model, a simple multilevel relational algebra and the semantics of update
operations for multilevel relations were described.

We believe that any attempt to define multilevel SQL should be based on a
multilevel relational algebra and the semantics of multilevel update operations. We plan to
investigate multilevel SQL issues in the near future.


The b²/c³ problem: How big buffers overcome covert channel
cynicism in trusted database systems

J.  McDermott 

Naval Research Laboratory, Code 5542, Washington, DC 20375, USA

Abstract 

We present a mechanism for communication from low to high security classes that
allows partial acknowledgments and flow control without introducing covert
channels. By restricting our mechanism to the problem of maintaining mutual
consistency in the replicated architecture database systems, we overcome the negative
general results in this problem area. A queueing theory model shows that big buffers
can be practical mechanisms for real database systems.

Introduction 

Kang and Moskowitz [8] presented a general mechanism for rapid and reliable
communication from low to high security classes. The mechanism, called the Pump,
includes an adjustable and easily quantifiable covert channel to provide
acknowledgments. Their general result is that reliability, performance, and security cannot be
achieved together. This negative result agrees with other work [10] and we do not
dispute it here. Instead, we present positive results for a useful special case of
communication from low to high classes: maintenance of mutual consistency in
replicated architecture multilevel-secure database systems.

The replicated architecture [6] is an approach to providing strong multilevel security
in database systems. It provides multilevel security by replicating single-level copies
of low sensitivity data into higher classes. The replicated architecture depends upon
the ability to write-up reliably without creating an undesirable information flow.
According to the Bell-LaPadula model [1], write-up without read access is
permissible. This kind of write-up is performed to volatile storage, without acknowledgment.
Furthermore, it requires the use of memory descriptors and mechanisms that do not
carry read access permission to the destination memory segment, a feature rarely
supported by existing hardware. The latter problem can be overcome by simulating
the write-up with a read-down, but the lack of coordination and the volatile nature of
the destination memory segment remain problematic.

By exploiting the structure of a computation, Sandhu, Thomas, and Jajodia [13, 14]
have shown how write-up without acknowledgment can be used in object-oriented
systems. Kang and Moskowitz have proposed a general mechanism for writing up
reliably with recovery by using acknowledgment with a controlled bandwidth¹ covert
channel. Kang and Moskowitz assert that, for the general case, one cannot have
write-up that is reliable, recoverable and secure. The thesis here is that, for an
important special case, this is not so. Our special case is write-up performed for the
purpose of maintaining mutual consistency in the replicated architecture.

1. Moskowitz and Kang [12] argue that the concept of bandwidth is not a sufficiently precise measure of the
vulnerability introduced by a covert channel and provide a new metric, the small message criterion (SMC).
The small message criterion depends on a triple (n, X, p): when a covert channel exists in a system, the SMC
gives guidance for what will be tolerated in terms of covertly leaking a short message (e.g. master key) of
length n bits in time X with fidelity of transmission p%.

Three advantages of restricting our solution to the replicated architecture are: 1)
bounded storage space requirements at the destination, 2) a relatively small number
of source and destination processes, 3) transaction management. Because we are
only writing up for the purpose of replicating data items in a database, we know that
no new objects are created by writing up to higher classes¹. We can fix the total
storage available at lower security classes and thus bound the total replicated storage
for all higher security classes. Because we are only supporting database system
instances, we know there will not be a large number of readers and writers².
Because we are only supporting systems with transaction management capability,
we can choose to discard some write-ups in a correct fashion, in the event of a failure,
and bring the replicas into convergence with later transactions. This latter point is
proved by Bernstein, Hadzilacos, and Goodman [2].

Our  specific  problem  is  to  provide  a  service  for  propagating  update  projections  in  the 
replicated  architecture  database  system.  This  service  is  to  be  reliable,  recoverable, 
and  secure.  By  secure  we  mean  free  from  implementation  invariant  covert  channels 
and compliant with a Bell-LaPadula access control policy, as discussed in [6]. By
recoverable  we  mean  that  write-ups  accepted  by  the  service  are  completed  in  the 
event  of  a  system  failure.  By  reliable  we  mean  that,  if  a  write-up  is  requested,  the 
requestor  can  know  if  the  write  succeeded  or  failed,  that  is,  acknowledgments  are 
given  to  the  writer. 

We  conclude  this  section  with  some  definitions.  In  the  following  sections  we  review 
the  Pump  mechanism,  define  the  basic  write-up  service,  discuss  necessary  buffer 
size,  present  some  usability  enhancements,  and  discuss  our  conclusions. 

In  our  discussion,  we  assume  that  all  processes  use  stable  storage  in  a  recoverable 
way.  Stable  storage  [2]  is  storage  that  is  not  affected  by  a  system  failure,  e.g.  disk 
storage.  Volatile  storage  is  storage  that  is  affected  by  system  failure;  system  failures 
cause  the  loss  of  possibly  all  of  the  contents  of  volatile  storage.  Stable  and  volatile 
are  relative  terms;  we  could  consider  off-line  tape  storage  as  stable  and  disk  storage 
as  volatile  because  disk  hardware  failures  do  not  affect  the  off-line  tapes.  A  more 
precise  definition  would  distract  us  from  our  point.  When  we  say  that  processes  use 
stable  storage  in  a  recoverable  way,  we  mean  that  they  keep  their  data  on  stable 
storage,  and  follow  the  usual  approaches  to  logging  and  caching  in  volatile  storage 
[2]  to  ensure  that  their  data  can  be  recovered  after  a  crash. 

1. Yes, there is a problem with multilevel transactions that will be discussed in the conclusion.

2. Readers and writers being database system server/data manager instances.

The Pump

The Pump provides communication from a low source process to a high destination
process. It is a trusted mechanism with three components: trusted low buffer TLB,
trusted high buffer THB, and communication buffer CB. A low source process sends
a message to a high destination process by first passing it to the trusted low buffer
TLB, which then gives the message to the communication buffer CB. When
messages are in the CB, the trusted high buffer THB signals the high destination
process and passes the message to it. Acknowledgments (ACK), negative
acknowledgments (NAK), and time-outs are used between THB and the destination
and between TLB and the source. These acknowledgments are necessary for
reliability and recoverability. They can be exploited as a covert channel because the
destination process can modulate its ACK and NAK messages (or time-outs) to leak
sensitive information to low. The Pump itself is trusted and cannot be exploited in
this way. Figure 1 shows the Pump.


Figure  1.  Message  Passing  From  Low  to  High  Using  the  Pump. 

Kang and Moskowitz throttle the covert channel by delaying the acknowledgments
from the high destination process in a way that gives approximately the same
expected (mean) response time but significantly reduces the influence that the high
destination process has on individual response times. The delay is added via a
random variable with a modified exponential distribution. By computing a moving
average they control the capacity of the channel and further complicate matters for
Trojan horses.
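
The idea of decoupling individual acknowledgment times from the high process while keeping the mean can be sketched as follows. This is not the Pump's actual algorithm: the text above only says that a modified exponential distribution and a moving average are used, so the sketch substitutes a plain exponential draw whose mean is the moving average of recent high acknowledgment times. The class name `AckDelay` and its parameters are hypothetical.

```python
import random
from collections import deque


class AckDelay:
    """Randomized ACK delay whose mean tracks recent high ACK times."""

    def __init__(self, window=5, seed=None):
        self.times = deque(maxlen=window)   # most recent high ACK times
        self.rng = random.Random(seed)

    def record_high_ack(self, seconds):
        self.times.append(seconds)

    def moving_average(self):
        if not self.times:
            raise ValueError("no high acknowledgments recorded yet")
        return sum(self.times) / len(self.times)

    def low_ack_delay(self):
        # Plain exponential stand-in for the modified distribution:
        # expovariate takes the rate, i.e. 1 / mean.
        return self.rng.expovariate(1.0 / self.moving_average())


delay = AckDelay(window=2, seed=1)
for seconds in (1.0, 2.0, 3.0):
    delay.record_high_ack(seconds)
```

With a window of two, only the last two observations (2.0 and 3.0) contribute, so the mean delay offered to low is 2.5 regardless of how high timed any single acknowledgment.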

The  Write-Up  Service 

Now we look at a write-up service that does not incorporate a covert channel in its
mechanism but nevertheless also provides effective reliability and recoverability.
Like the Pump, our write-up service also depends upon trusted software. The key
point of the trust is that trusted software will only send legitimate control messages
(i.e. NAK is only sent when a write-up fails). Protocol events caused by the write-up
service are not due to a Trojan horse. We prevent modulation of the write-up service
itself by disconnecting the flow of acknowledgments from high to low, and
compensating for this by providing a probabilistic form of guaranteed delivery.

The service provides a set of write-up ports to the low process, that is, the writer. It
provides a different set of receive ports to the high process that acts as the
destination. The service maintains, in stable storage, a buffer to store the messages. The
service follows a fairly conventional protocol, except there is no acknowledgment
from the destination process to the source process.

The  buffer  slots  can  be  either  full  or  free  and  a  message  in  the  buffer  can  be  removed 
from a receive-down port or overwritten by the write-up service. The low source
process is allowed to query the write-up service regarding the status of a buffer slot, but
not  the  status  of  a  message  in  the  buffer  slot.  The  high  destination  process  can  query 
the  status  of  buffer  slots  and  messages  in  the  buffer  slots.  The  buffer  starts  with  all 
slots  free  and  no  messages  in  the  slots. 

1. Low connects to a write-up port.

2. High connects to a receive-down port by specifying the kinds of messages it
wants to receive.

3. Low sends a message. If the message is received by the trusted write-up service
then an ACK is sent to low, the message is placed in a free buffer slot, the slot is
marked full, and low may discard its copy of the message. If the message is not
received by the write-up service then the trusted write-up service will either
send NAK or low will time-out. In either failure case low retries the write-up. If
the buffer is full, that is no free buffer slots are available, the write-up service
will tell low to wait.

4.  The  write-up  service  signals  or  interrupts  the  high  process  to  notify  it  that  a 
message  has  arrived  from  low.  After  either  a  fixed  or  random  time  interval,  the 
message’s  buffer  slot  is  marked  free.  Freeing  a  buffer  slot  does  not  remove  a 
message  via  a  receive-down  port. 

5.  High  removes  the  message  from  its  receive-down  port.  The  write-up  service 
does  not  tell  low  that  the  message  has  been  removed  from  the  port.  Removing  a 
message  does  not  free  its  corresponding  buffer  slot. 
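
The steps above can be sketched as a small in-memory model. It is a simplification under stated assumptions: time is passed in explicitly, slots are freed after a fixed interval `hold_time`, NAK and time-out handling and stable storage are omitted, and the names `WriteUpService`, `send`, `receive` and `lost` are hypothetical. What low observes (ACK or WAIT) depends only on the clock and on low's own sends, never on whether high has removed anything.

```python
class WriteUpService:
    """Toy model of the trusted write-up service buffer."""

    def __init__(self, size, hold_time):
        self.hold_time = hold_time
        # Each slot is None or a dict with the message, the time at which
        # the slot becomes free again, and whether high has removed it.
        self.slots = [None] * size
        self.lost = []   # messages overwritten before high removed them

    def _is_free(self, slot, now):
        return slot is None or now >= slot["free_at"]

    def send(self, msg, now):
        """Step 3: low sends msg at time `now`; returns 'ACK' or 'WAIT'."""
        for i, slot in enumerate(self.slots):
            if self._is_free(slot, now):
                if slot is not None and not slot["removed"]:
                    self.lost.append(slot["msg"])   # overwritten, unread
                self.slots[i] = {"msg": msg,
                                 "free_at": now + self.hold_time,
                                 "removed": False}
                return "ACK"
        return "WAIT"

    def receive(self):
        """Step 5: high removes the oldest message it has not yet removed.
        Removing a message does not free its slot."""
        waiting = [s for s in self.slots
                   if s is not None and not s["removed"]]
        if not waiting:
            return None
        slot = min(waiting, key=lambda s: s["free_at"])
        slot["removed"] = True
        return slot["msg"]


svc = WriteUpService(size=2, hold_time=10)
replies = [svc.send("m1", 0), svc.send("m2", 1), svc.send("m3", 2)]
got = svc.receive()
later = [svc.send("m3", 10), svc.send("m4", 11)]
```

In this run the third send is told to wait because both slots are still held, high then removes m1, and m2 is eventually overwritten unread once its slot is freed and reused, which is the loss a big enough buffer is meant to make improbable.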

To summarize, if we define a message in a free slot as discarded, denoting this
condition as dis, removed from a receive-down port as rem, and overwritten as over, we
have six possible message conditions:

dis and not rem and over        (1)
dis and not rem and not over    (2)
dis and rem and over            (3)
dis and rem and not over        (4)
not dis and not rem             (5)
not dis and rem                 (6)
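
One way to see why there are six conditions rather than eight is to enumerate the three flags and discard the combinations in which a message is overwritten while its slot is still full. That exclusion is our reading of the protocol (new messages are only placed in free slots), and the function name `reachable` is hypothetical.

```python
from itertools import product


def reachable(dis, rem, over):
    """A message can be overwritten only after its slot has been freed."""
    return dis or not over


# All (dis, rem, over) triples that the protocol can produce.
conditions = [(dis, rem, over)
              for dis, rem, over in product([True, False], repeat=3)
              if reachable(dis, rem, over)]

# Condition (1): discarded, never removed by high, and overwritten.
lost = [(dis, rem, over) for dis, rem, over in conditions
        if dis and not rem and over]
```

Only the single triple in `lost` corresponds to a message that never reaches high.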

Steps three, four, and five can be repeated until either high or low decides to end the
write-up session and disconnects. Flow control can be improved by overlapping
several acknowledgments with a sliding window protocol. The low source processes are
allowed to know how large the buffer is and when it is full, that is, they can be
legitimately blocked when the buffer is full because the state of the buffer does not
depend on the destination process. Figure 2 shows the components of the write-up
service.


[Figure 2 diagram: low source process; service process (trusted) with a buffer private to the service process; high destination process]

Figure 2. Basic write-up service

This protocol provides communication with conventional flow control between the
source and the service process. We could even have an incremental improvement in
the overall performance and reliability by having the service process send the
messages (instead of a signal) and conduct a separate flow control protocol with the
destination, as long as this protocol did not change the rate at which buffer slots were
freed by the service process. The flow control is not modified by random extensions of
the delay associated with sending a message. There is no covert channel due to
acknowledgments sent from the high destination process to the low source process
because there are none. If a malicious destination process refuses to receive
messages, then the messages are overwritten¹. Thus performance and security exceed
those of the more general Pump mechanism. As we shall show next, the reliability
and recoverability can be made arbitrarily good.

Big  Buffers 

Our write-up service depends on careful buffer management to avoid overwriting a
message, condition (1) above. In normal operation and during short-term failure, the
success of our approach depends on being able to establish a big (enough) buffer. As
we will show, it is possible to determine the probability of overwriting an update
projection as a function of the buffer size and system load. Because of this we can choose
a buffer size that makes the buffer practically infinite.

Let us define a catastrophic failure K as a failure of a database system that causes
parts of some transactions to be lost and the database system to produce an incorrect
history. This can happen even when correct transaction processing mechanisms are
used because the failure (most likely a combination of failures) causes one of the
underlying assumptions to be untrue (e.g. hardware failure or single-event upset in
the running software). Because of the transaction processing mechanisms and the
care taken in designing and implementing the system we expect the probability $p_K$ of
catastrophic failure K to be relatively small. Now define $p_o$ as the probability that an
update projection will be overwritten. If the size L of the buffer is sufficiently large so
that $p_o < p_K$, then we say the buffer is a big buffer.

1. We make no claim to protect against denial of service, but such behavior would be detected quickly and
the offending software removed.

How big does a buffer need to be to be a big buffer? To answer this we model the
destination process as a server in an M/M/1 queueing model¹, where the queue is finite.
Recall that M/M/1 queueing models have exponentially distributed interarrival and
service times, a single server, and are used to find steady-state values. The mean arrival
rate of the service requests (write-ups) is denoted $\lambda$ and the mean service rate
(removal of messages by the destination process) is denoted $\mu$. We call the ratio $\lambda/\mu$
the offered load (imposed on the system) and denote it by a. Offered load a represents
the relative load on the system and is measured in units called erlangs. As a
concrete example of how offered load a relates to performance we can find the delay
for a particular offered load on our write-up system, using Little's Law [9]. Let L be
the mean number of requests in a queue or in the server and W the mean length of
time it takes a request to pass through the system (i.e. sojourn time); then

$$L = \lambda W \qquad (7)$$

for a wide range of queueing models, including the M/M/1 model with finite queue
size.

For finite queues, the easiest way to apply Little's Law is to calculate the mean number
of requests in the queue directly. Queueing theory [3] gives us the probability $p_n$
of n update projections being present in the finite queue as

$$p_n = \begin{cases} \dfrac{(1-a)\,a^{n}}{1-a^{\mathrm{max}+1}} & \text{for } 0 \le n \le \mathrm{max} \\[1ex] 0 & \text{for } n > \mathrm{max} \end{cases} \qquad (8)$$

where max is the size of the buffer. We then compute the mean number of requests in
the queue as $L = \sum_{0 \le n \le \mathrm{max}} n\,p_n$.
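Equation (8) and the mean queue length can be evaluated numerically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the a = 1 branch is the standard limiting case of equation (8), and the delay uses Little's Law in the form of equation (7) with the nominal arrival rate.

```python
def queue_distribution(a, max_size):
    """Return [p_0, ..., p_max] from equation (8) for offered load a."""
    if a <= 0:
        raise ValueError("offered load must be positive")
    if a == 1.0:
        # Limiting case of equation (8): all states are equally likely.
        return [1.0 / (max_size + 1)] * (max_size + 1)
    norm = (1.0 - a) / (1.0 - a ** (max_size + 1))
    return [norm * a ** n for n in range(max_size + 1)]


def mean_in_system(a, max_size):
    """Mean number of requests L, the sum of n * p_n for 0 <= n <= max."""
    return sum(n * p for n, p in enumerate(queue_distribution(a, max_size)))


def mean_delay(arrival_rate, a, max_size):
    """Mean sojourn time W from Little's Law, L = lambda * W."""
    return mean_in_system(a, max_size) / arrival_rate


if __name__ == "__main__":
    # One update projection per second, a 600-slot buffer, a = 0.5 erl.
    print(mean_delay(arrival_rate=1.0, a=0.5, max_size=600))
```

At a = 0.5 erl with one arrival per second, the computed delay is about one second.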

So, if our write-up system was receiving one update projection per second on the
average (i.e. $\lambda = 1.00$), the buffer size was 600 update projections, and the offered load
was a = 0.99 erl, then, by Little's Law, a write-up would take roughly two and a half
minutes to propagate, on the average. Practical systems operate with much smaller
offered loads; for example, if we take a = 0.5 erl, the update projection propagates in
about one second. These values would hold even in an untrusted system that could
use conventional flow control protocols.

If we set n = max in equation (8), we get $p_{\mathrm{max}}$, the probability of a full buffer. Since a
full buffer causes an overwrite, we can treat $p_{\mathrm{max}}$ as $p_o$, the probability of overwriting an
update projection. Figure 3 shows a plot of buffer size as a function of offered load
for a range of overwrite probabilities.

1. Besides being tractable, this model is appropriate because the source and destination processes in a
replicated architecture are essentially the same, though possibly loaded differently. With respect to tractability,
our current model of a finite M/G/1 queue must be run overnight to compute a single data point. Its results
tend to agree with the more tractable M/M/1 model.


Figure 3. Buffer size as a function of offered load for a range of overwrite probabilities

As we see from Figure 3, a buffer size between 100 and 500 is sufficient for offered
loads as high as a = 0.95 erl with an overwrite probability of 10^-[exponent illegible], a
condition where the average delay for our previous example is about 28 seconds. Since there are
3.1536 × 10^7 seconds in a year, it is unlikely that our write-up system will have an
overwrite during its useful lifetime. Significant increases in reliability can be
obtained for relatively small increases in buffer size. If we reduce our overwrite
probability to 10^-[exponent illegible], we only need a buffer size of about 600 at an offered
load of a = 0.95 erl.
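The relationship between buffer size and overwrite probability can be explored with a short search. This is a minimal sketch, assuming the finite M/M/1 full-buffer probability from equation (8) with n = max; the function names and the target value used in the example are illustrative choices, not figures taken from the text.

```python
def overwrite_probability(a, max_size):
    """Probability of a full buffer: equation (8) evaluated at n = max."""
    if not 0.0 < a < 1.0:
        raise ValueError("this sketch assumes an offered load with 0 < a < 1")
    return (1.0 - a) * a ** max_size / (1.0 - a ** (max_size + 1))


def smallest_big_buffer(a, target, limit=1_000_000):
    """Smallest buffer size whose overwrite probability is at most target."""
    for size in range(1, limit + 1):
        if overwrite_probability(a, size) <= target:
            return size
    raise ValueError("no buffer size up to the limit meets the target")


if __name__ == "__main__":
    # Illustrative target only; the paper's own probabilities are not assumed.
    print(smallest_big_buffer(a=0.95, target=1e-12))
```

Because the full-buffer probability falls geometrically with buffer size when a < 1, each extra factor of reliability costs only a roughly constant number of additional slots.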

Since update projections are relatively small objects (an average size of 1K bytes is
quite generous for a logical update projection¹), provision of big buffers is practical.
In practice the average size of an update projection is likely to be an order of magnitude
smaller. Even if our update projections were 1K bytes, we would only need
about 600K bytes of buffer storage for a write-up service.

Because buffer exhaustion cannot be used to communicate, we assume no attempt by
the untrusted sender or receiver to fill up the buffer in order to cause an unauthorized
information flow. This justifies our use of conventional models based on independent
arrival and service times, and conventional steady-state values.


1. A logical update projection is implemented by sending the text of an update transaction rather than the
physical writes it generates.


Recoverability 

From the perspective of the source processes and the write-up service proper, the
write-up service appears to handle system failures just as a conventional system
would. Messages in the buffer are in stable storage; transactional logging procedures
can be used to restore the buffer in the event of a system failure. If source processes
use similar techniques, they can retransmit messages that were not acknowledged
by the write-up service. For this reason, the write-up service provides recoverability
for failures of the source processes and of the write-up service itself; we will not
discuss it further.

The question of recoverability with respect to failure of the destination process is
more interesting. Our basic approach is to make the write-up service buffer large
enough to hold all the messages that may be sent before a destination process can
recover. Here we are only able to succeed because we restrict the problem to
replicated-architecture database systems. Because we are using source processes with
finite memory we know we will have to choose to discard some update projections
(write-up messages) in the event of a long-term destination process failure. This
choice has nothing to do with write-up strategies but rather with the finite capacity
of the source process. The source must continue to process new updates at its own
security class and it must eventually run out of space to store the new update
projections it wishes to propagate and so must discard some of them. This is not a problem;
the same choice is made for conventional distributed database systems [2, § 8.5]. For
this reason, if we restricted ourselves to use of the Pump, we would still have to
discard some write-up projections in the event of long-term failure¹.

Since we know some update projections will have to be discarded in some cases, we
can choose to define a short-term failure to be one that fits our desired range of offered
loads. That is, if we expect offered load a to be small and failures to be infrequent, we
can define "short" as a longer period of time than if we expect frequent failures of the
destination process or if we expect offered load a to be relatively large. In any case,
in determining the required buffer size, we simply treat short-term failures as additional
write-up requests that tie up the system for some period of time equal to the
time needed to detect and recover from the failure.

Usability Enhancements To The Basic Service

There  are  some  non-critical  enhancements  we  can  make  to  the  basic  service  to 
improve  its  usability  in  practical  systems:  message  time-outs,  overwrite  priorities, 
and  variable  buffer  sizes. 

First, we can make it easier for the destination process to manage its rate of message
receipt and the write-up service to adjust its buffer size. To do this, we set a timer for
each message when it is accepted by the write-up service. Messages received for write-up
are stored until they either expire or they are received by a high destination process.
If a message is received it is marked as such but is not removed from the buffer. Only
high destination processes can tell if a message has been received. The time-out
period for messages is fixed at system generation time. Upon time-out the message
expires, is marked as such, and it may be overwritten or discarded by the write-up
service. An expired message may be received and a received message will expire.
Only expired messages are overwritten or discarded. A human user (database
administrator or system security officer) can monitor the performance of the destination
process with respect to the time-outs and adjust the buffer size of the write-up
service if necessary. If we use this option, we want to hide the buffer size from the
source process.

1. Recall that one-copy serializability does not require all writes to update all copies.
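The time-out rules can be illustrated with a small model. This is a minimal sketch under stated assumptions, not the service's implementation: the class and method names are hypothetical, the current time is passed in explicitly to keep the example deterministic, and the behavior when the buffer is full of unexpired messages (the new message is refused) is an assumption made here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    payload: bytes
    accepted_at: float       # time the write-up service accepted the message
    received: bool = False   # set when a high destination process receives it


class TimedBuffer:
    """Hypothetical model of the message time-out enhancement."""

    def __init__(self, capacity, timeout):
        self.capacity = capacity
        self.timeout = timeout   # fixed at system generation time
        self.slots = []

    def expired(self, slot, now):
        return now - slot.accepted_at >= self.timeout

    def accept(self, payload, now):
        """Store a message from the source; return False if no slot is usable."""
        if len(self.slots) >= self.capacity:
            # Only expired messages may be overwritten or discarded.
            for index, slot in enumerate(self.slots):
                if self.expired(slot, now):
                    del self.slots[index]
                    break
            else:
                return False
        self.slots.append(Slot(payload, now))
        return True

    def receive(self):
        """High side: mark the oldest unreceived message; do not remove it."""
        for slot in self.slots:
            if not slot.received:
                slot.received = True
                return slot.payload
        return None
```

Whether a slot can be reused depends only on elapsed time, never on whether the high process has received the message, so the received flag does not influence anything the source can observe.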

A  second  enhancement  we  can  make  will  improve  the  ability  of  the  write-up  service 
to  ensure  that  critical  messages  are  more  likely  to  be  received.  The  write-up  service 
can  provide  a  priority  parameter  that  indicates  the  criticality  of  the  message.  If  an 
overwrite  is  necessary,  the  lower  priority  messages  will  be  overwritten  first. 
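Priority-based overwriting amounts to choosing a victim by priority. The following is a minimal sketch; the function name, the encoding of a message as a (priority, arrival order) pair, and the rule that ties go to the oldest message are all assumptions made for illustration.

```python
def choose_victim(messages):
    """Return the index of the message to overwrite: lowest priority first.

    `messages` is a list of (priority, arrival_order) pairs, where a larger
    priority means a more critical message. Ties on priority are broken in
    favor of overwriting the oldest message (an assumption of this sketch).
    """
    if not messages:
        raise ValueError("no message to overwrite")
    return min(range(len(messages)), key=lambda index: messages[index])


if __name__ == "__main__":
    buffered = [(2, 0), (1, 1), (1, 2), (3, 3)]
    print(choose_victim(buffered))
```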

A  third  enhancement  we  can  make  is  to  provide  variable  buffer  size.  The  write-up 
service  does  not  need  to  maintain  a  fixed  buffer  size,  if  the  buffer  size  is  not  visible  to 
the  sending  processes.  With  this  approach,  the  write-up  service  would  maintain  an 
estimate  of  offered  load  a  and  adjust  the  buffer  size  as  needed.  In  the  case  of  a  full 
buffer,  the  write-up  service  would  first  try  to  expand  the  buffer  and  then  overwrite 
an  earlier  message  if  no  more  free  space  were  available.  Our  model  shows  that  this  is 
possible,  since  big  buffers  will  fit  easily  into  the  stable  storage  space  available  on  a 
dedicated  frontend  or  replica  controller. 
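The expand-then-overwrite behavior can be sketched as follows. This is an illustrative model only: the class name, the doubling growth policy, and the choice of the oldest message as the one overwritten are assumptions, and the estimation of offered load a is left out.

```python
class VariableBuffer:
    """Hypothetical buffer that grows before it overwrites."""

    def __init__(self, initial_capacity, max_capacity):
        if not 0 < initial_capacity <= max_capacity:
            raise ValueError("need 0 < initial_capacity <= max_capacity")
        self.capacity = initial_capacity
        self.max_capacity = max_capacity   # bound set by available stable storage
        self.messages = []

    def store(self, message):
        """Store a message; return the overwritten message, or None."""
        overwritten = None
        if len(self.messages) >= self.capacity:
            if self.capacity < self.max_capacity:
                # First choice: expand the buffer (doubling is an assumption).
                self.capacity = min(self.capacity * 2, self.max_capacity)
            else:
                # No more free space: overwrite an earlier message.
                overwritten = self.messages.pop(0)
        self.messages.append(message)
        return overwritten
```

For example, a buffer created with `VariableBuffer(2, 4)` absorbs four messages without loss and only begins returning overwritten messages from the fifth call to `store` onward.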

It is also possible to let the source process know the size of the buffer used by the
write-up service but still vary the effective buffer size. If the destination process is
designed as an interrupt handler (i.e. a small program that quickly removes data
from a port and then schedules work for a larger process that uses the data) it can
have a variable-size buffer. This second buffer can be implemented at the destination
security class and thus will be invisible to the source process. The destination process
buffer can vary according to a, and the destination process will only be responsible
for receiving write-ups from the service. The replicated-architecture database
system can then accept update projections from the destination process.


[Figure 4 components: low source process; service process (trusted); high buffer
handler process.]

Figure 4. Improved write-up service with variable-size buffer

Conclusions 


The Pump represents a good general mechanism for writing up. However, we believe
that practical covert-channel-free alternatives exist for the special case of the
replicated architecture, with comparable or better time performance without a meaningful
sacrifice of reliability and recoverability. The write-up service we have described
here is one alternative. Our alternative mechanism has the same performance as an
untrusted communication mechanism, that is, no delays are introduced. It has no
covert channel due to acknowledgments. There is a nonzero probability of overwriting
a message, but the reliability and recoverability can be made arbitrarily good by
appropriate choice of buffer size. Acknowledgments and flow control can be
extended to cover, separately, both source-to-service and service-to-destination
communications. We can easily make the write-up mechanism more reliable than the
system it supports. It is not clear that an adjustable covert channel can be set to be
smaller than the smallest covert channel that might be exercised, particularly in
light of the small message criterion of Moskowitz and Kang. On the other hand, we
must admit that our write-up service is not general, but limited to an important
special case and may not apply to other special cases.

Stable storage is inexpensive compared to the cost of developing new applications on
high-assurance trusted systems. This is the same justification for the
replicated-architecture approach, which is the place we expect this service to be used. Where
the offered load a is less than 1.1 erl, we can provide the desired reliability within
the bounds of conventional disk systems. In the more likely case, where the offered
load a is less than 0.95 erl, we can succeed with buffers whose size is between $10^2$
and $10^3$ update projections.

Our write-up service has not been specified so that it can deal with creation of new
data items at higher security classes. This kind of operation is not available in the
current SINTRA prototype [7], but is necessary for fully general multilevel transac-
[...] of new data items via blind write-up, but this seems less than satisfactory. Future
work should investigate models and mechanisms for extending big buffer write-up to
handle creation of new data items in a more elegant fashion. We also plan to look at
more advanced models of buffer size, such as M/G/1 queues with finite buffers.


Abstract

In this paper, we present a design for a multilevel secure (MLS) database management
system (DBMS) intended to meet the Class B2 requirements of the Department of
Defense Trusted Computer System Evaluation Criteria. Our design approach allows us
to support client-server operation without introducing trusted code. We also present a
discussion of the issues that arise in the development of a multilevel secure client-server
DBMS and an analysis of the relationship between client-server architectural design
choices and assurance.


1  Introduction 

There is a significant requirement for multilevel secure (MLS) database management system
(DBMS) technology in the Department of Defense (DOD) and the intelligence community.
A multilevel secure DBMS is a DBMS that can store and process information at multiple
security levels and serve multiple users, some of whom may not be cleared for all the
information in the system. The area of multilevel database management has been actively
pursued in the research community, and significant progress has been made in the theory
and practice of multilevel DBMS design and implementation.

The Client-Server model is fast becoming the predominant model of database access. A
client-server architecture offers many advantages over traditional mainframe-based
monolithic architectures, among them improved access to shared data resources, more
effective use of computational resources, support for incremental growth, and support for
Open System standards resulting in greater database interoperability. However, multilevel
secure database technology and implementation have lagged behind advances in Open
Systems distributed computing architectures. The result is that users must sacrifice Open
Systems/distributed computing solutions in order to meet security requirements.

In this paper, we report on an effort to design and implement a multilevel secure
client-server DBMS intended to satisfy the Department of Defense Trusted Computer System
Evaluation Criteria (TCSEC) functionality and assurance requirements for a Class B2
computer system [1, 2]. The starting point for this effort was an existing commercial relational
DBMS product called RUBIX. An earlier phase of this effort focused on reengineering
RUBIX to satisfy the B2 requirements. The results of that phase are described in [3]. This
paper describes progress since then in migrating the standalone version of Trusted RUBIX
to a client-server architecture.

In section 2, we provide a brief overview of the standalone version of Trusted RUBIX
that was the implementation basis for the client-server version of Trusted RUBIX. In section
3, we discuss multilevel secure client-server DBMS design issues. In section 4, we describe
the design for the client-server version of Trusted RUBIX. In section 5, we discuss the
relationship between client-server architectural choices and assurance. In section 6, we
present conclusions and discuss future research.

2  Overview  of  Standalone  Trusted  RUBIX 

The  starting  point  for  our  client-server  implementation  effort  was  a  standalone  version  of 
Trusted  RUBIX  running  on  the  AT&T  3B2  running  UNIX  System  Laboratories’  (USL) 
System  V  Release  4.1  ES  (SV4.1ES).  Since  this  system  forms  the  core  of  the  server  in  our 
design,  we  provide  a  brief  overview  of  its  policy  and  architecture. 

2.1  Security  Policy 

The Trusted RUBIX security policy is an adaptation of the policy developed as part of
the SeaView project [4]. The mandatory access control (MAC) policy is a straightforward
interpretation of the Bell and LaPadula [5] model. Subjects are operating system processes
running on behalf of users. Each subject has a security level that is dominated by the
corresponding user's clearance. The MAC objects are databases, schemata, relations, indexes,
view definitions, access control lists, and tuples. A subject can read an object if its security
level dominates the security level of the object. A subject can write an object if its security
level is equal to the security level of the object.
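The read and write rules can be stated compactly in code. This is a minimal sketch, not Trusted RUBIX code: the `Label` structure (a hierarchical level plus a set of categories) and the function names are assumptions used to illustrate dominance in a lattice of security levels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    level: int                          # hierarchical level; larger is more sensitive
    categories: frozenset = frozenset() # non-hierarchical categories (assumed)


def dominates(a, b):
    """a dominates b if its level is at least b's and it holds all of b's categories."""
    return a.level >= b.level and a.categories >= b.categories


def can_read(subject, obj):
    """Read is allowed when the subject's level dominates the object's level."""
    return dominates(subject, obj)


def can_write(subject, obj):
    """Write is allowed only when the two levels are equal."""
    return subject == obj


if __name__ == "__main__":
    low = Label(1)
    high = Label(2, frozenset({"X"}))
    print(can_read(high, low), can_read(low, high), can_write(high, low))
```

Restricting writes to equal levels is stricter than allowing any write upward: a subject can neither write down nor blindly write up.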

Discretionary  access  control  (DAC)  is  enforced  in  Trusted  RUBIX  by  allowing  users 
to  specify  which  users  and  groups  are  authorized  for  specific  access  modes  (privileges)  on 
database  objects.  The  DAC  objects  in  Trusted  RUBIX  are  databases,  schemata,  stored 
relations,  and  views.  Different  modes  of  access  are  permitted  on  different  objects  (e.g., 
select,  insert,  delete,  and  update  are  some  of  the  modes  supported  on  tables).  A  special 
NULL  access  mode  is  used  to  support  the  explicit  denial  of  access. 
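A discretionary check with explicit denial might look like the following. This is a minimal sketch under loudly stated assumptions: the text only says that a NULL mode supports explicit denial, so the ACL representation, the function name, and the precedence rules (a user entry decides alone; otherwise any NULL group entry denies) are hypothetical choices made here, not the Trusted RUBIX semantics.

```python
NULL = "null"   # hypothetical marker for the explicit-denial access mode


def dac_allows(acl, user, groups, mode):
    """Decide whether `user` may use `mode` on the object guarded by `acl`.

    `acl` maps ("user", name) or ("group", name) to a set of access modes.
    Assumed precedence: a user entry, if present, decides alone; otherwise
    group entries are combined, and any NULL group entry denies access.
    """
    user_entry = acl.get(("user", user))
    if user_entry is not None:
        return NULL not in user_entry and mode in user_entry
    granted = set()
    for group in groups:
        entry = acl.get(("group", group), set())
        if NULL in entry:
            return False
        granted |= entry
    return mode in granted


if __name__ == "__main__":
    acl = {
        ("user", "mallory"): {NULL},
        ("group", "staff"): {"select", "insert"},
    }
    print(dac_allows(acl, "alice", ["staff"], "select"))
    print(dac_allows(acl, "mallory", ["staff"], "select"))
```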

The details of the Trusted RUBIX adaptation and interpretation of the SeaView policy
are discussed in [3]. A user-level description of the Trusted RUBIX protection mechanisms
is provided in [6].

2.2  System  Architecture 

The Trusted RUBIX architecture is based on the concept of a protected subsystem [7]. All
Trusted RUBIX data are stored in one or more volumes, which are single-level operating
system objects. To support fine-grained multilevel objects (viz., tuples), labels are attached
to individual database items within each operating system object. Note that these labels are
DBMS labels, not operating system labels; the operating system views these labels strictly
as data and attaches no security significance to them. The DBMS is trusted to properly
associate and maintain the label of each item and to correctly interpret those labels so
that, in cooperation with the operating system trusted computing base (TCB), the security
policy can be correctly enforced. To encapsulate protected data and to implement the
DBMS security policy, this architecture employs trusted subjects. A trusted subject is a
subject that runs with special privilege and can bypass the operating system's security
policy whenever this is necessary to implement the DBMS security policy.

Each Trusted RUBIX volume is encapsulated using two mechanisms. The first mechanism
is to add the special category RUBIX to the volume security label. Only subjects in
the Trusted RUBIX TCB can run with this category in their label. The second mechanism
used to protect database volumes is to make them accessible only to the reserved group
RUBIXTP. Access to this group can only be obtained when invoking a program in the
RUBIX TCB (via the UNIX setgid mechanism). These mechanisms ensure that only subjects
in the RUBIX TCB can directly access volume data (i.e., the TCB is non-bypassable).

The programs that make up the Trusted RUBIX TCB are also protected by two
mechanisms. First, they are MAC protected from unauthorized modification by labeling them
with the hierarchical level USER_PUBLIC, which is dominated by the level of all untrusted
processes. Second, these programs are protected by installing them with execute-only
permissions. When in execution, these programs are protected by the underlying Trusted UNIX
process isolation mechanism. These mechanisms ensure that the TCB is tamper-resistant.

The implementation of the Trusted RUBIX TCB employs a technique known as privilege
bracketing. Privilege bracketing refers to the procedure of explicitly acquiring a privilege
immediately before a system call requiring that privilege and releasing the privilege
immediately after the call completes. The motivation for privilege bracketing is to minimize the
execution time during which a trusted process holds a privilege, thereby supporting the
least privilege principle within the DBMS TCB.
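Privilege bracketing maps naturally onto an acquire/release pattern. The following is a minimal sketch that simulates the idea in user space; the `Process` class, the privilege name, and the helper functions are hypothetical and do not correspond to the SV4.1ES privilege interface.

```python
from contextlib import contextmanager


class Process:
    """Hypothetical stand-in for a trusted process and its active privileges."""

    def __init__(self):
        self.privileges = set()


@contextmanager
def bracket(process, privilege):
    """Hold `privilege` only for the duration of the enclosed call."""
    process.privileges.add(privilege)          # acquire immediately before the call
    try:
        yield
    finally:
        process.privileges.discard(privilege)  # release immediately after, even on error


def privileged_call(process, privilege, call):
    """Run `call` with the privilege bracketed tightly around it."""
    with bracket(process, privilege):
        return call()


if __name__ == "__main__":
    proc = Process()
    privileged_call(proc, "mac_override", lambda: print(proc.privileges))
    print(proc.privileges)
```

Releasing in a `finally` clause keeps the privilege from lingering when the bracketed call fails, which is the property the least privilege principle is after.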


3  Multilevel  Secure  Client-Server  DBMS  Design  Issues 

There are a large number of design issues that arise in developing a multilevel secure
client-server DBMS architecture to satisfy higher assurance levels (e.g., B2). Many of these
issues are not security relevant, or also arise in the context of a centralized multilevel secure
DBMS. The following are some of the security-relevant architectural issues that are unique
to multilevel secure client-server database management systems:

•  Placement  of  trusted  DBMS  code  on  the  client  machine.  It  is  desirable  to  have  a 
design  that  limits  all  security  relevant  DBMS  functions  to  the  server.  The  reason  for 
this  is  that  it  is  easier  to  reason  about  the  correctness  of  a  security  mechanism  that 
is  not  distributed  between  client  and  server,  and  therefore,  a  higher  level  of  assurance 
in  its  correctness  can  be  obtained.  One  difficulty  in  attaining  this  limitation  is  that  it 
requires  a  relatively  sophisticated  secure  distributed  computing  infrastructure  (SDCI) 
to  implement.  Another  difficulty  is  that  it  imposes  limits  on  client  functionality  (e.g., 
support for trusted applications).

• DBMS or OS/network identification and authentication. In a client-server DBMS,
there must be some way for the server to identify and authenticate the user. This
function can be performed by the DBMS itself (e.g., by sending a user identifier
and password to the server) or the DBMS can utilize the underlying SDCI in such
a way that DBMS-based authentication is unnecessary. The problem with the first
approach is that it requires that some portion of the client DBMS be trusted. This is
a consequence of the trusted path requirement that is introduced at B2. The problem
with the second approach is that it requires a relatively sophisticated SDCI.

• DBMS or network listener. A client-server DBMS ultimately requires that there be
some software that listens to a well-known address on the network and responds to
client requests. In a process-per-user server architecture, this listener is generally
a separate process which will start up an instance of the server on demand. In a
multithreaded architecture, the listener is generally a standing instance of the server.
It is desirable that this function be performed in the SDCI for two reasons. First, the
listener must be multilevel* and therefore would increase the size of the DBMS TCB.
Second, the listener is responsible for reliably associating a user-id with a request
and therefore requires a trusted counterpart on the client platform. If this function
is performed by the DBMS, then there must also be trusted DBMS functionality on
the client platform. Relying on the SDCI for this function has the same drawbacks
mentioned in the previous two items.

• Single-level or multilevel client platforms. It would be desirable to support an
architecture with multilevel servers and single-level client platforms. The problem with
such an architecture is that the client platform must be relied upon in some way for
identification and authentication (except in the case of a single-level component
dedicated to a single user). If this identification is to be used for MAC purposes, then
the client platform must be at least as assured as the server (viz., B2). One
possibility is to use single-level client platforms evaluated at the C2 level and use their
I&A only for DAC purposes. Architectures such as this can use the identification and
authentication inherent in communications security (COMSEC) lines connecting the
client and server platforms to satisfy the mandatory trusted path requirement. This
approach is consistent with the ideas underlying the Trusted Network Interpretation
of the TCSEC [8].

• Homogeneous or heterogeneous client platforms. It is preferable for a client-server
DBMS to run on heterogeneous client and server platforms. The problem with this
is that the required SDCI is not widely available at a high level of assurance. The
most desirable solution to this problem is to have multiple vendors develop support
for these services from a service and protocol definition that has been adopted as an
industry standard. This is being done in the CMW world with the MaxSix
architecture. Another possibility is to have a third party provide the services as an additional
trusted layer implemented on top of the heterogeneous platforms, as is done in ORA's
Theta [9].

• Single or multiple administrative domains. It is also desirable for a client-server DBMS
to allow clients and servers to have different user-id, group-id, and security level spaces.
The disadvantage of this is that there must be functionality to do mapping between
these attributes. There is also the issue of who does the attribute mapping: the
DBMS or the SDCI. This is an important decision because this is clearly a security
relevant function. Having a single administrative domain removes this problem, but
the result is an architecture that is inflexible and not scalable.

*It is possible to have multiple single-level listener processes, but this would require having one active
listener for each point in the lattice.

Figure 1: Trusted RUBIX Client-Server Architecture.

The  Trusted  RUBIX  resolution  of  these  issues  will  be  covered  in  the  next  section. 


4  Trusted  RUBIX  Client-Server  Architecture 

The  system  architecture  for  Trusted  RUBIX  is  shown  in  Figure  1.  Clients  are  untrusted 
application  programs  that  have  been  linked  with  the  Trusted  RUBIX  client  software  so  that 
they can communicate with the Trusted RUBIX server. There is one instantiation of the
server  for  each  active  client.  A  given  client  and  server  pair  can  run  on  the  same  machine  or 
on  different  machines  connected  on  a  network.  The  server  must  reside  on  the  same  machine 
as  the  data  to  be  accessed.  All  communication  between  client  and  server  takes  place  using 
the  SQL  Remote  Database  Access  standard  protocol  [10,  11].  A  client  can  access  multiple 
servers  concurrently  (although  no  mechanism  for  distributed  transaction  management  is 
provided).  Currently,  two  types  of  clients  are  supported:  the  standard  Interactive  SQL 
(ISQL)  interface  and  user-developed  Embedded  SQL  (ESQL)  applications. 

Figure 2: SV4.1ES Secure Networking Facilities

Our design approach is to layer this architecture onto an existing SDCI in such a way
that no additional trusted code is introduced. To make this possible, the SDCI must satisfy
the following requirements:

• Remote Process Invocation. The SDCI must provide a way to invoke a process on a
remote machine at the invoker’s current level, user-id, and group-id.

• Single-level connection establishment. The SDCI must provide a way to establish a
single-level connection between the invoker and the new process.

• Network security policy enforcement. The SDCI must ensure that the services do not
violate intra- or inter-host mandatory or discretionary access controls.

• Data Confidentiality. The SDCI must protect the confidentiality of transmitted data
(this can be accomplished by physically protecting the transmission media).

• Attribute Mapping. The SDCI must perform user-id, group-id, and security level
mapping.

The SDCI we selected for the implementation of Trusted RUBIX was the secure networking
services of SV4.1ES [12]. The primary components of the SV4.1ES secure networking
facilities are the connection server and listen port monitor (Figure 2). The connection server
handles all client-side connection establishment as a single service. The listen port monitor
is a daemon that listens on the server machine for incoming connection requests, accepts
the requests, and starts services that have been requested. In order for these facilities to
be used to provide a service (e.g., Trusted RUBIX), the service must have been previously
registered with the connection server on the client machine and the listen port monitor on
the server machine.

When an application needs to access a service on a remote machine, it makes a request
to the connection server. The connection server validates that the connection to the remote
machine is permitted at the application’s level. If the connection is permitted, the connection
server opens a connection to the corresponding listen port monitor on the remote
machine. Once the connection is established, the connection server and listen port monitor
exercise a mutual authentication scheme (a cryptographic scheme based on secret keys). If
the authentication succeeds, the connection server passes the client’s user-id, group-id, and
security level to the port monitor. The port monitor maps the user’s identity to a local
user identity and starts the server corresponding to the requested service. The connection
server then passes its end of the connection to the client, and the port monitor passes its
end of the connection to the newly invoked server. The client and server are now connected
at the desired security level and user-id and can begin a session.
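The connection sequence just described can be sketched in a few lines of Python. This is only an illustrative model of the prose: the class names, the shared-key comparison standing in for the cryptographic mutual authentication, and the host, user, and level values are all assumptions made for the example and are not part of the SV4.1ES interface.

```python
from dataclasses import dataclass

# All names below are hypothetical; they model the prose, not the SV4.1ES API.


@dataclass(frozen=True)
class Credentials:
    user_id: str
    group_id: str
    level: str


class ListenPortMonitor:
    """Server-side daemon: authenticates peers, maps identities, starts services."""

    def __init__(self, shared_key, services, user_map):
        self.shared_key = shared_key
        self.services = services      # registered service name -> server factory
        self.user_map = user_map      # remote user-id -> local user-id

    def accept(self, key, service, creds):
        if key != self.shared_key:    # stand-in for the cryptographic exchange
            raise PermissionError("mutual authentication failed")
        if service not in self.services:
            raise LookupError(f"service {service!r} is not registered")
        # Map the remote identity to a local one; the level is carried over unchanged.
        local = Credentials(self.user_map[creds.user_id], creds.group_id, creds.level)
        return self.services[service](local)


class ConnectionServer:
    """Client-side service: checks the level, then contacts the remote port monitor."""

    def __init__(self, shared_key, permitted_levels, monitors):
        self.shared_key = shared_key
        self.permitted_levels = permitted_levels  # host -> set of allowed levels
        self.monitors = monitors                  # host -> ListenPortMonitor

    def connect(self, host, service, creds):
        if creds.level not in self.permitted_levels.get(host, set()):
            raise PermissionError(f"no connection to {host} at level {creds.level}")
        return self.monitors[host].accept(self.shared_key, service, creds)


def start_server_driver(creds):
    # The "server" here is just a record of the attributes it was started with.
    return {"service": "trusted_rubix", "runs_as": creds}


monitor = ListenPortMonitor("k1", {"trusted_rubix": start_server_driver}, {"alice": "alice_db"})
conn_server = ConnectionServer("k1", {"dbhost": {"SECRET"}}, {"dbhost": monitor})
session = conn_server.connect("dbhost", "trusted_rubix", Credentials("alice", "staff", "SECRET"))
```

In this sketch the server is started with the mapped user-id and the client's level, and a request at a level not permitted for the host is refused before any connection is opened.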

The details of how Trusted RUBIX utilizes these facilities are addressed in the following
sections. 

4.1  Detailed  Design 

We  describe  the  design  of  client-server  extensions  to  Trusted  RUBIX  in  terms  of  its  module 
structure, internal layering, and its process structure. The module structure is a decomposition
of the system into modules. Each module consists of a set of closely related procedures.
The  following  are  the  modules  related  to  the  client-server  portion  of  the  Trusted  RUBIX 
architecture  (the  structure  of  the  server  engine  itself  is  beyond  the  scope  of  this  paper). 

•  The  Interactive  SQL  (ISQL)  module  provides  an  interactive  SQL  interface.  This 
interface  can  be  used  to  submit  ad  hoc  queries  to  a  Trusted  RUBIX  server. 

• The Embedded SQL Preprocessor (ESQLP) module is a preprocessor that takes a C
program with embedded SQL statements and translates it into a C program that
can be compiled and linked with Trusted RUBIX client software to access the server.

•  An  ESQL  Application  (ESQLA)  module  is  an  application  module  generated  by  the 
ESQL  preprocessor. 

•  The  SQL  Client  Interface  (SQLCI)  module  provides  a  call-level  interface  that  can  be 
used to access the Trusted RUBIX server. There are two versions of this module, a
network  version  and  a  standalone  version.  The  network  version  uses  the  RDA  module 
to  access  the  server.  The  standalone  version  uses  the  RXIPC  interface  directly  and  is 
therefore  only  usable  when  client  and  server  are  on  the  same  machine. 

•  The  Remote  Database  Access  (RDA)  module  provides  low-level  services  that  allow  a 
client  to  start/terminate  a  remote  server,  manage  transactions,  execute  SQL  queries 
on  the  server,  and  retrieve  query  results  and  status  information.  This  module  utilizes 
the  SQL  Remote  Database  Access  protocol  to  support  client-server  communication. 
This  module  supports  both  a  client-side  and  server-side  interface. 

•  The  Secure  Networking  Facilities  (SNF)  module  provides  services  required  for  invoking 
the  server  on  a  host  and  establishing  a  secure  communication  channel  between  client 
and  server.  This  module  supports  both  a  client-side  and  server-side  interface.  This 
module  is  not  trusted  in  itself,  but  hides  the  details  of  the  underlying  secure  networking 
facilities.  When  these  facilities  change,  only  this  module  needs  to  be  modified. 

• The Server Driver (SD) module is the server-side controller for Trusted RUBIX. It
accepts requests over the network connection (via RDA) and executes them on the
Server Engine (via the interface provided by RXIPC). Query results and status information
are then returned to the client.

• The RUBIX Interprocess Communication (RXIPC) module hides the details of the
IPC interface to the Trusted RUBIX engine. The Trusted RUBIX engine runs isolated
in a separate address space (as described in Section 2) and is accessible only through
this interface. Functions on the interface of this module implement a form of remote
procedure call by translating calls into IPC requests to the rxserver process (which is
also encapsulated within this module).

• The Server Interface (SI) module provides a function call interface to the server. This can
only be accessed through the RXIPC module since direct linking with this code would
result in a non-isolated TCB.

Figure 3: Trusted RUBIX Software Architecture

It  is  important  to  note  that  of  the  above  modules,  only  the  RXIPC  and  SI  are  within 
the  RUBIX  TCB;  and  these  modules  are  required  even  in  the  standalone  architecture.  That 
is,  no  new  trusted  code  was  introduced  to  support  client-server  operation. 

The  layering  of  the  Trusted  RUBIX  client-server  modules  is  shown  in  Figure  3.  The 
shaded part of the figure indicates modules in the Trusted RUBIX TCB. The RXIPC module
is unique in that it consists of a trusted and an untrusted part. These parts correspond to
the  two  ends  of  the  IPC  connection.  Note  that  the  “Server  Engine”  module  in  the  figure 
actually  consists  of  a  large  number  of  modules  (including  the  Server  Interface  module)  which 
are  not  shown  in  this  diagram.  A  discussion  of  these  modules  is  beyond  the  scope  of  this 
paper. 

The  process  structure  is  a  decomposition  of  the  run  time  activities  of  the  system  into 
processes.  Trusted  RUBIX  actually  has  two  process  structures  corresponding  to  the  two 
possible builds of the system (i.e., the standalone process structure and the client-server
process  structure).  In  the  following,  we  will  focus  on  the  client-server  process  structure. 

As shown in Figure 4, the process structure for Trusted RUBIX consists of three processes.
There is the application process which consists of an application linked with the
Trusted RUBIX client code. The application can be either the ISQL application supplied
with Trusted RUBIX or an application a user developed using ESQL. This process runs on
behalf of the user and has no special privileges.

Figure 4: Trusted RUBIX Process Structure

The  server  driver  process  runs  on  the  server  and  reads  database  requests  from  the 
application  process  (using  the  RDA  protocol)  and  sends  those  requests  to  the  rxserver 
process  using  the  RXIPC  interface.  This  process  is  invoked  as  the  result  of  a  remote 
process  invocation  request  by  the  client  (this  functionality  is  supplied  through  SNF).  The 
process  is  invoked  at  the  client’s  level,  user-id,  and  group-id,  and  runs  with  no  special 
privilege.  As  part  of  the  invocation  process,  this  process  obtains  a  single  level  connection 
to  the  application  process. 

The  rxserver  process  forms  the  Trusted  RUBIX  TCB  (this  is  indicated  by  the  shading 
in  Figure  4).  It  reads  database  requests  from  the  server  driver  and  executes  them  against 
the  requested  database.  The  rxserver  process  performs  all  the  access  mediation  functions  of 
Trusted  RUBIX.  It  is  started  by  the  server  driver  when  the  server  driver  receives  a  request 
to open a database. Rxserver is a trusted subject that runs under the user’s user-id (the
group-id  of  this  process  is  set  to  RUBIXTP  so  that  it  can  access  protected  data). 

4.2  Operational  Scenario 

The  operational  characteristics  of  this  architecture  are  best  illustrated  with  a  scenario.  A 
user  starts  a  session  by  logging  in  to  a  client  machine  at  a  specific  level.  Since  the  client 
machine  is  a  B2  platform,  it  identifies  and  authenticates  the  user  via  a  trusted  path.  Once 
authenticated,  the  user  can  invoke  a  Trusted  RUBIX  application  on  the  client  machine. 
When the application requests a connection to the server (e.g., through an SQL CONNECT
statement), the request is ultimately translated into a request to the SV4.1ES networking
facilities to access a remote service (viz., the Trusted RUBIX server). The networking
facilities first validate that the requested invocation does not violate network security policy.
If the operation is permitted, the networking facilities invoke a Trusted RUBIX server driver
process on the remote host at the application’s security level and using the corresponding
user’s user-id and group-id (possibly mapped). The final step in the invocation process is
to provide the client application and server driver with a single-level connection over which
database access requests and results can be passed.

The  server  driver  is  the  client’s  agent  on  the  server  machine.  When  the  client  requests 
access  to  a  database,  the  server  driver  invokes  the  rxserver  process.  This  process  inherits 
the  security  level  and  user  attributes  of  the  server  driver.  An  interprocess  communication 
channel  is  set  up  between  the  server  driver  and  the  rxserver  process  for  the  communication 
of  requests  and  results. 

The client sends queries to the server (server driver) using the RDA protocol. These
queries are forwarded to the rxserver process, which mediates access based on the user’s
level and user-id. Since the server driver is untrusted, the rxserver process (i.e., the Trusted
RUBIX TCB) labels all data coming from the server driver at the server driver’s level, and
mediates all database accesses based on that level. Since the rxserver process was started under
the user’s user-id (possibly mapped), it has a basis for making discretionary access control
decisions. Results are sent from the rxserver process to the server driver and back to the
client application. When the client terminates its connection, the server driver notifies the
rxserver process to terminate and then terminates itself.
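The mediation step can be illustrated with a small Python sketch. The paper does not spell out the access rules, so the sketch makes explicit assumptions: a totally ordered set of three levels (a real system would use a lattice), a read rule under which a session sees only rows whose label it dominates, and a per-table access control list for the discretionary check. The class and table names are hypothetical.

```python
# Hypothetical total order of levels; an assumption made for this example only.
LEVELS = {"UNCLASSIFIED": 0, "CONFIDENTIAL": 1, "SECRET": 2}


def dominates(a, b):
    """True if level a is at or above level b."""
    return LEVELS[a] >= LEVELS[b]


class RxServerSketch:
    """Toy stand-in for the trusted mediator; one instance per client session."""

    def __init__(self, store, acl, session_level, user_id):
        self.store = store                  # shared list of (table, label, value)
        self.acl = acl                      # table -> set of permitted user-ids
        self.session_level = session_level  # level inherited from the server driver
        self.user_id = user_id

    def _check_dac(self, table):
        if self.user_id not in self.acl.get(table, set()):
            raise PermissionError(f"{self.user_id} has no access to {table}")

    def insert(self, table, value):
        self._check_dac(table)
        # Data arriving from the untrusted driver is labeled at the session level.
        self.store.append((table, self.session_level, value))

    def select(self, table):
        self._check_dac(table)
        # Assumed read rule: return only rows whose label the session dominates.
        return [value for (t, label, value) in self.store
                if t == table and dominates(self.session_level, label)]


store = []
acl = {"missions": {"alice", "bob"}}
low = RxServerSketch(store, acl, "CONFIDENTIAL", "bob")
high = RxServerSketch(store, acl, "SECRET", "alice")
low.insert("missions", "supply run")
high.insert("missions", "recon flight")
low_view = low.select("missions")
high_view = high.select("missions")
```

Under these assumptions the CONFIDENTIAL session sees only its own row, while the SECRET session sees both, and a user missing from the access control list is refused regardless of level.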

In  this  section,  we  presented  a  design  for  a  multilevel  secure  client-server  DBMS.  Our 
basic design approach was to layer the DBMS client and server onto an existing secure distributed
computing infrastructure. This approach allowed us to transition Trusted RUBIX
to a client-server architecture without the introduction of any new trusted code. Another
notable  aspect  of  our  design  is  that  all  DBMS  policy  enforcement  functions  are  centralized 
on  the  server  (i.e.,  not  distributed).  This  is  important  because  it  makes  it  easier  to  reason 
about  the  correctness  of  the  mechanism  enforcing  the  system  security  policy. 

5  Assurance  in  Client-Server  Architectures 

Assurance  is  concerned  with  providing  convincing  and  rigorous  technical  arguments  that 
the  security  mechanisms  in  a  secure  system  are  implemented  correctly.  The  assurance 
arguments  for  distributed  systems  are  inherently  more  complex  than  the  corresponding 
arguments  for  centralized  systems.  This  section  discusses  the  relationship  between  certain 
critical  client-server  DBMS  design  choices  and  assurance.  In  this  discussion,  we  will  limit 
ourselves  to  architectures  where  the  DBMS  code  is  layered  on  top  of  an  existing  distributed 
computing infrastructure such as in the design presented in the previous section.

To build a client-server DBMS on top of a secure distributed computing infrastructure
is to build on top of a distributed TCB where each host platform (client or server) forms a
partition of that TCB. We will use the terminology from the Trusted Network Interpretation
of the TCSEC [8] and refer to these partitions as Network TCB (NTCB) partitions. Figure
5 illustrates the layering of the DBMS client and server on top of the distributed TCB.
The degree of assurance that can be attained with such an architecture depends heavily
on the characteristics of the client and server components of the DBMS software. To aid
in our analysis, we categorize client and server components based on their policy-related
responsibilities. Our two categories are: support for DBMS mandatory access control and
support for DBMS discretionary access control.† A given client-server architecture can now
be characterized by assigning zero or more of the categories to the client and to the server.
The result is a set of 16 possible assignments (4 possible combinations for the client times
4 for the server).‡

†The notion of supporting policies (viz., identification, authentication, and audit) is rolled into these
categories since these policies can be supportive of MAC, DAC, or both. That is, if a component implements
supporting policies for MAC, we consider it a MAC component.

‡A more detailed analysis could be done using the categories presented in Appendix A of the TNI. The
four categories presented there (M, D, I, and A) would yield a total of 256 combinations.

Figure 5: Client-Server Distributed TCB (the DBMS client and the DBMS server, each layered on its own NTCB partition)

Certain (overlapping) classes of the above assignments are of particular interest from an
assurance point of view and are discussed below.

• No DBMS policy enforcing software in the client or server. This would be the case
in an architecture that relied completely on the underlying distributed TCB for its
security (similar to a Hinke/Schaefer approach [13] on a distributed TCB). This class
offers the highest level of assurance since the DBMS components enforce no security
policy in themselves and run with no special privilege with respect to the underlying
distributed TCB. Their security characteristics depend completely on the security
characteristics of the underlying distributed TCB.

• Discretionary policy enforcing software in the server only. This would be the case in
an architecture that relied on an underlying distributed TCB for MAC enforcement,
but implemented its own DAC on top of the DAC policy of the underlying TCB
(similar to a SeaView [14] approach on a distributed TCB). This case would have a
high degree of assurance for MAC because the DBMS runs with no special privilege
with respect to the underlying distributed TCB. The level of assurance obtainable for
DAC would be commensurate with the assurance techniques applied to the DBMS
server. The important point here is that the effort involved in obtaining a given degree
of assurance for this architecture would be no more than that required for a similar
standalone architecture. This is because the DAC policy enforcement mechanisms are
centralized on the server.

• Mandatory policy enforcing software in the server. This would be the case in a trusted
subject DBMS architecture where part of the server runs as a trusted subject with
respect to the underlying distributed TCB. This is the Trusted RUBIX case. In this
case, more analysis is required to gain a high level of assurance because the DBMS
server runs with special privilege with respect to the underlying distributed TCB.
The difficulty that arises is that the DBMS server constitutes a modification to the
underlying TCB, and therefore, any assurance arguments must not only consider the
DBMS TCB itself, but also its impact on and interactions with the underlying distributed
TCB. This is similar to a single machine case, where the assurance arguments for a
trusted subject DBMS must extend to the underlying operating system. It needs
to be determined if these impacts can be isolated to the NTCB partition, or if the
assurance arguments would need to extend to the system level. This architecture
retains the advantage that the DBMS policy enforcement mechanisms are centralized
on the server.

•  Policy  enforcing  software  in  the  client  and  server.  In  this  case,  the  assurance  argument 
may  or  may  not  have  to  extend  into  the  distributed  TCB  depending  on  the  approach 
to  MAC  enforcement  (i.e.,  TCB  subset  or  trusted  subject).  But  since  the  DBMS 
policy  enforcement  responsibilities  are  distributed,  the  assurance  argument  for  the 
DBMS  TCB  will  be  much  more  complex  than  in  the  case  when  the  DBMS  TCB  is 
centralized. 


This  discussion  covered  only  a  few  of  the  possible  architectures.  The  important  point  to 
remember is that the complexity of the assurance argument for a client-server DBMS architecture
(which is directly related to evaluation difficulty) seems to be even more sensitive
to  architectural  choices  than  are  assurance  arguments  for  standalone  DBMS  architectures. 
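The count of possible architectures used in this section can be checked mechanically. The following Python sketch enumerates every assignment of policy categories to client and server; the function names are our own, and the final filter (an empty client set, meaning that all DBMS policy enforcement is centralized on the server) is just one illustrative way to slice the result.

```python
from itertools import combinations, product


def subsets(categories):
    """All subsets of the given categories, as frozensets (including the empty set)."""
    return [frozenset(c)
            for r in range(len(categories) + 1)
            for c in combinations(categories, r)]


def assignments(categories):
    """Every (client, server) pair of category subsets."""
    options = subsets(categories)
    return list(product(options, options))


two_category = assignments(("MAC", "DAC"))   # the two categories used in this section
tni_category = assignments(("M", "D", "I", "A"))  # the four TNI Appendix A categories

# Assignments in which the client enforces no DBMS policy at all.
server_only = [(c, s) for (c, s) in two_category if not c]

print(len(two_category), len(tni_category), len(server_only))
```

With two categories each side has 4 possible subsets, giving 16 assignments; with the four TNI categories each side has 16, giving 256. Of the 16 two-category assignments, 4 leave the client with no policy enforcement.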

6  Conclusions  and  Future  Research 

In  this  paper,  we  presented  a  design  for  a  multilevel  secure  client-server  DBMS  intended 
to  satisfy  the  TCSEC  requirements  for  a  Class  B2  computer  system.  We  also  presented  an 
analysis  of  the  relationship  between  client-server  design  choices  and  assurance.  The  major 
conclusions  from  this  effort  are: 

•  It  is  possible  to  develop  a  multilevel  secure  client-server  DBMS  without  introducing 
additional  trusted  code  over  that  used  in  the  server  engine.  This  can  be  accomplished 
by  building  on  an  existing  secure  distributed  computing  infrastructure.  This  general 
approach can be used with servers having trusted subject or TCB subset architectures.

•  The  complexity  of  the  assurance  argument  for  a  client-server  DBMS  (and  therefore 
the evaluation difficulty) is dependent primarily on two factors: whether the client or
server requires trusted subjects, and whether the policy enforcement functions of the
DBMS  are  distributed  between  client  and  server. 

•  The  enabling  technology  for  a  multilevel  secure  client-server  DBMS  developed  using 
this  approach  is  not  yet  mature.  Some  operating  system  vendors  (e.g.,  UNIX  System 
Laboratories)  provide  the  necessary  functionality,  but  the  functionality  is  not  currently 
in  their  evaluated  configuration. 

There  are  a  number  of  directions  in  which  this  work  could  be  extended.  One  promising 
area  is  to  investigate  how  this  architecture  can  be  used  with  single-level  client  machines  as 
discussed in Section 3. Another possible extension of this work is to allow clients to execute
transactions  that  span  multiple  servers.  If  clients  are  assumed  to  be  single-level,  then  they 
may  only  connect  to  servers  at  that  single  level.  Since  the  connections  are  single  level, 
and  the  servers  to  which  they  are  connected  are  multilevel  trusted,  this  class  of  transaction 
would not appear to introduce any significant covert channels [15]. This of course assumes
that  the  concurrency  control  mechanisms  of  the  underlying  servers  are  secure. 

A more ambitious extension of this work is to modify the server to support a true distributed
database capability. Issues that could be investigated in such an effort include
secure distributed transaction management, secure distributed query processing, secure access
to heterogeneous databases, security implications of data replication, secure distributed
database infrastructure, and impacts of security on Open Systems standards.
Abstract

This paper describes a powerful, yet practical, formalism for modeling and controlling imprecise FD-based
inference in relational database systems. The formalism provides a canonical representation of inference
which unifies precise inference and the primitive imprecise inference mechanisms of abduction and partial
deduction. Whereas other imprecise (partial) inference models estimate the probability of making inferences,
the formalism supports the analysis of the actual imprecise values inferred in a database extension. Imprecise
inference is analyzed by transforming a precise database augmented with additional “catalytic” relations,
conveying possibly imprecise a priori knowledge, into an equivalent imprecise database. The analysis of
imprecise inference and the related inference control methodology are highly flexible and robust. They can
be directly applied to classical, MLS, and imprecise databases. With minimal modifications, they also can be
used in knowledge discovery or database mining.


1  Introduction 

Controlling user access to database data has been the focus of considerable attention in the security community.
However, serious security compromises also can arise from inference attacks [2]. An inference attack occurs in a
multilevel secure (MLS) database [11] when a low user is able to infer sensitive information from common knowledge
and authorized query responses. Su and Ozsoyoglu [19,20] showed that integrity constraints, particularly
FDs and MVDs, pose serious security threats. They devised an algorithm for optimally upgrading information
to eliminate inference compromises; they also showed that this problem is NP-complete. More recently, Lunt
and colleagues at SRI International [h.lu 18] have developed an interactive tool, DISSECT, for detecting and
eliminating compositional inference channels due to foreign key FDs. The DISSECT model builds on earlier work
on inference control including tools and techniques developed by Buczkowski [1], Thuraisingham [2h,24], and
Hinke [9]. The current version of DISSECT [18] is limited to analyzing MLS database schemas (intensions) rather
than actual MLS relations (extensions). Nevertheless, its success shows that it is possible to develop practical
tools for dealing with the difficult problem of inference control.

Effectively controlling imprecise inference in MLS databases remains a relatively unexplored problem of great
importance. An imprecise or partial inference compromise occurs when a low user is able to infer an exact value
or a set of possible values (an information chunk) for a sensitive attribute with a certain probability. The
granularity of the inferred chunk may be small enough and/or its probability high enough to constitute a security
breach. Not only is imprecise inference just as damaging as precise inference, it is also far more prevalent.

Several researchers have recognized the importance of controlling imprecise or partial inference in MLS
databases. Su and Ozsoyoglu [19,20] noted that compromise elimination techniques involving FDs and MVDs
could be extended to consider ranges of values. Morgenstern [13,14] was among the earliest researchers to formally
investigate imprecise inference. He viewed inference as a means to “localize the space of possible values.”
A unique feature of his approach is its use of constraint expressions involving “spheres of influence” to capture
logical inference and an information-theoretic (entropy) measure for imprecise information. The inference control
tool developed by Buczkowski [1] uses Bayesian probability to estimate security risks due to imprecise inference;
his model extends Morgenstern’s framework to permit the propagation of imprecise inference. Garvey and Lunt
[4,6,10] have characterized inference channels, including partial inference channels due to abduction and probabilistic
reasoning. Although the current version of their DISSECT tool is geared for precise inference, plans are
underway to extend it to partial inference control [15,18].

*To whom correspondence should be addressed.

Research supported by Grant IRI-Oll 0700 and C)f'\4ST Grant AR'2-002-

This  paper  describes  a  powerful,  yet  practical,  formalism  for  modeling  and  controlling  FD-based  imprecise 
inference  in  relational  databases.  The  formalism  uses  scalar  domain  partitions  —  which  we  call  “contexts”  — 
to  model  imprecise  inference  and  to  express  integrity  constraints  [3,16,17].  Contexts  also  provide  a  canonical 
representation  which  unifies  precise  inference  and  the  primitive  imprecise  inference  mechanisms  of  abduction 
and partial deduction [7,21,22]. Morgenstern’s sphere of influence notion for inferred information [13,14] has
motivated the development of our formalism. Indeed, the equivalence classes contained in contexts can be viewed
as specifying the “maximal” spheres of influence for data involved in imprecise inference. Whereas other techniques
(e.g., [1,4,6,10]) estimate the probability with which sensitive values are inferred, the context-based formalism
enables us to consider the actual “information chunks” inferred in a database extension. To distinguish it from
probability-based  models,  we  use  the  term  “imprecise”  rather  than  “partial”  to  characterize  the  inference  model. 

The  imprecise  inference  model  is  highly  flexible  and  robust.  Imprecise  inference  is  analyzed  by  transforming 
a precise database, possibly augmented with additional “catalytic” relations [8] conveying a priori knowledge
available to low users, into an equivalent imprecise database. This technique, which is similar to those used in
deductive databases, is particularly suited to modeling the propagation of compromising imprecise inferences in
MLS databases. Furthermore, the canonical representation of imprecise inference enables the inference analysis
methodology  to  be  directly  applied  to  classical,  MLS,  or  imprecise  databases.  With  minimal  modifications,  it  also 
can  be  used  in  knowledge  discovery  or  database  mining. 

This paper is organized into six main sections. Following these introductory remarks, Section 2 examines
the  primitive  FD-based  inference  mechanisms  of  abduction  and  partial  deduction  and  provides  a  foundation  for 
their  unification.  Section  3  formally  defines  the  context  model,  an  imprecise  data  model  representing  our  view 
of  imprecise  inference.  The  key  concepts  of  an  imprecise  FD  and  an  imprecise  inference  channel  are  presented  in 
Section  4.  Section  5  describes  the  related  imprecise  inference  control  methodology.  The  conclusions,  relevance 
and  applications  of  the  imprecise  inference  analysis  and  control  methodologies  are  summarized  in  Section  6. 


2  Imprecise  Inference 

In  general,  an  FD-based  inference  may  be  deductive,  involving  forward  reasoning  with  an  FD,  or  abductive, 
involving  backward  reasoning.  Furthermore,  FD-based  inference  may  be  precise  or  imprecise,  depending  on  the 
nature  of  the  FD  and  the  data  used  in  inference.  An  imprecise  inference  compromise  occurs  when  a  user  can 
infer  a  range  of  values  for  a  sensitive  attribute  using  other  information  in  the  database.  The  granularity  of  the 
inferred  information  could  be  fine  enough  to  constitute  a  security  threat. 

Abduction and partial deduction are the two primary mechanisms for imprecise inference. This section
describes the two mechanisms and shows how they can be unified using the notion of an induced set FD, a special
kind of imprecise FD. The unification yields an elegant definition of an imprecise inference channel and greatly
simplifies the related inference analysis.

Note that the definitions and illustrative examples in this section and in the remainder of this paper use
classical relations to simplify the presentation. The extension to MLS relational databases (even those containing
polyinstantiated data) is accomplished by imposing the mandatory security constraints on data classifications
and user clearance levels. Views of MLS relations created at specific user clearance levels are individually analyzed
for potentially compromising imprecise inferences.

2.1  Abduction


The existence of an FD: X → Y implies that a precise value for X determines a unique (and precise) value for
Y. It is also possible to use the FD (actually its manifestation in a relation) to reason backwards, i.e., to
determine a value for X given a precise (or imprecise) value for Y. If the FD corresponds to a many-to-one
function, then the inferred value for X is imprecise. This mechanism is commonly referred to as abduction. For
example, in the relation below, given the salary 20K and the fact that the FD: Name → Salary exists, it is
possible to determine a set of names, {Bill, Bob}, that map to the salary value. Note that it is not necessary for
a user to know the complete FD mapping. Information is abduced using only the portion of the FD manifested
in the relational extension.


Name    Salary
Bill    20K
Bob     20K
Joe     25K
Jill    35K

To understand how abduction can lead to a potential security compromise, consider the relation above, where
salary information is sensitive to the point that low users should not obtain the salary of any employee to any
degree of precision. Now assume that the two relations below are accessible to low users (e.g., in a poorly-designed
database view).


Name    Tax
Bill    10%
Bob     10%
Joe     10%
Jill    15%

Salary  Tax
20K     10%
25K     10%
35K     15%

Provided with access to the relations R(Name, Tax) and R(Salary, Tax) above, a low user could infer the
following relation by abduction.


Name    Salary
Bill    {20K, 25K}
Bob     {20K, 25K}
Joe     {20K, 25K}
Jill    {35K}

Note that, although the information in the abduced relation is imprecise, a security compromise exists because
the low level user has a reasonably good idea of how much each employee earns. The solution, of course, is to
classify the Tax attribute in R(Name, Tax) as sensitive so that the abductive channel is closed.
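
The abductive step in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two low-visible relations are modelled as dictionaries, and the helper names inverse_image and abduce_salaries are hypothetical, not part of the paper's formalism.

```python
# Relations assumed visible to low users, copied from the example above.
NAME_TAX = {"Bill": "10%", "Bob": "10%", "Joe": "10%", "Jill": "15%"}
SALARY_TAX = {"20K": "10%", "25K": "10%", "35K": "15%"}


def inverse_image(fd, targets):
    """Return all left-hand side values that the FD maps into `targets`."""
    return {lhs for lhs, rhs in fd.items() if rhs in targets}


def abduce_salaries(name_tax, salary_tax):
    """For each name, abduce the candidate salaries that share its tax rate."""
    return {
        name: inverse_image(salary_tax, {tax})
        for name, tax in name_tax.items()
    }


abduced = abduce_salaries(NAME_TAX, SALARY_TAX)
for name in sorted(abduced):
    print(name, sorted(abduced[name]))
```

Reasoning backwards through Salary → Tax is what turns a precise tax rate into a set of salaries; removing Tax from the low view leaves nothing to join on.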


2.2  Partial  Deduction 


Partial deduction generalizes precise deductive inference. Given an FD: X1...Xn → Y, partial deduction occurs
when values for a subset of left-hand side attributes are used to determine a right-hand side attribute value.
Partial deduction using a proper subset of the X attributes yields an imprecise Y value.

Job         Experience  Salary
Technician  Little      15K
Technician  Lots        20K
Scientist   Little      20K
Scientist   Lots        25K
Manager     Little      25K
Manager     Lots        35K

To understand partial deduction, assume that the FD: Job, Experience → Salary holds in the relation above.
Projecting out the Experience attribute yields the following relation.


Job         Salary
Technician  15K
Technician  20K
Scientist   20K
Scientist   25K
Manager     25K
Manager     35K

Information in the projected relation above can be used with the FD: Job, Experience → Salary in partial
deductive inference. This gives rise to the relation presented below. Note that knowledge of a Job value allows the
inference of an approximate salary, e.g., {15K, 20K} for a Technician, which may be potentially compromising.


Job         Salary
Technician  {15K, 20K}
Scientist   {20K, 25K}
Manager     {25K, 35K}
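
A minimal Python sketch of partial deduction on the Job, Experience → Salary example follows. It assumes the relation is a list of tuples; the function name partial_deduction and its parameters are illustrative choices rather than the paper's notation.

```python
from collections import defaultdict

# The precise relation from the example: (Job, Experience, Salary).
JOBS = [
    ("Technician", "Little", "15K"),
    ("Technician", "Lots", "20K"),
    ("Scientist", "Little", "20K"),
    ("Scientist", "Lots", "25K"),
    ("Manager", "Little", "25K"),
    ("Manager", "Lots", "35K"),
]


def partial_deduction(rows, known, target):
    """Group rows by the known attribute positions and collect target values."""
    inferred = defaultdict(set)
    for row in rows:
        key = tuple(row[i] for i in known)
        inferred[key].add(row[target])
    return dict(inferred)


# Knowing only Job (a proper subset of the left-hand side) gives salary sets.
by_job = partial_deduction(JOBS, known=(0,), target=2)
# Knowing the full left-hand side gives precise (singleton) salaries.
by_job_and_experience = partial_deduction(JOBS, known=(0, 1), target=2)
```

With the full left-hand side every inferred set is a singleton, which is the precise deduction that partial deduction generalizes.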

2.3  Unifying  Abduction  and  Partial  Deduction 

The abduction example shows how a low user can infer a range or set of possible salaries given values for the
right-hand side of the FD. Likewise, the partial deduction example shows how a range of possible salaries may
be inferred from a job description, i.e., a proper subset of the left-hand side attributes. In both examples,
precise values are used to produce imprecise inferences. In general, however, inference can initiate with precise or
imprecise values: using imprecise information gives rise to inferred information which is correspondingly imprecise.
The ability to infer precise or imprecise information by abduction and partial deduction stems from "induced set
functions" generated from FD mappings manifested in database relations (extensions) [7,21].

Induced set functions for some function f : X → Y define the image of each A ⊆ dom(X) and the inverse
image of each B ⊆ dom(Y). The induced set function concept allows us to define induced FDs which unify
abduction and partial deduction.

Definition: Let f : X → Y denote an FD. Then F : X → Y is the induced FD of f such that the image of A ⊆
dom(X) is F(A) = {y ∈ dom(Y) : y = f(x) for some x ∈ A}.


As explained above, partial deduction uses values for a subset of the left-hand side attributes in an FD: X →
Y. Let us denote the subset of attributes and the tuple component values corresponding to these attributes by
X' and x', respectively. Partial deduction uses the x' value to derive a value A = {t[X] : t[X'] = x'}. The image
F(A) is a set of tuple components in Y which corresponds to the imprecise information that can be inferred from
x' by partial deduction.

Definition: Let f : X → Y denote an FD. Then F⁻¹ : Y → X is the inverse induced FD of f such that the
inverse image of B ⊆ dom(Y) is F⁻¹(B) = {x ∈ dom(X) : f(x) ∈ B}.

Abduction uses values for the right-hand side attribute in an FD: X → Y. (Note that it is necessary only to
consider FDs with single attributes on their right-hand sides.) Let us denote the tuple component in Y as y and
let B = {y}. The inverse image of B is the information that can be inferred from y by abduction.

The two definitions above show that an induced FD: X → Y is a function mapping sets of X tuple components
(subsets of dom(X)) to sets of Y tuple components (subsets of dom(Y)). This mapping is induced from a relational
extension. Induced FDs and inverse induced FDs can be treated in a uniform manner. For this reason, we refer to
them collectively as induced FDs. This notion unifies the primitive imprecise inference mechanisms of abduction
and partial deduction.

In general, an imprecise FD maps set values to set values. (A set and its subsets are considered to be equivalent
when determining the functional relationship.) The induced FD for a given relation is the "minimal" imprecise
FD generated in the relation. To understand the distinction between the two concepts consider the following
precise relation.


X    Y
a    1
a    2
b    2

The FD: X → Y does not hold in the above relation. However, it is possible to "induce" an FD from X to Y
by "merging" precise tuples. The induced FD: X → Y holds in the imprecise relation below (left). It is obtained
by merging the fewest tuples in the precise relation so that a functional relationship exists between set values.


X      Y              X      Y
{a}    {1,2}          {a}    {1,2}
{b}    {2}            {b}    {1,2}

A "coarser" imprecise FD: X → Y holds in the relation on the right. The imprecise relation corresponding
to this imprecise FD is obtained by merging and "coarsening" (or "clouding") tuples to a greater degree than
is required to produce the induced FD. For any relational extension exactly one induced FD can be generated
between any two sets of attributes; however, numerous imprecise FDs can be made to hold by appropriately
clouding the extension. The induced FD and imprecise FD concepts will be formalized in Section 4 using the
notion of a "context" defined below.


3  Contexts  and  Imprecise  Relations 

Imprecise inference analysis involves the transformation of a precise database to an imprecise database to
materialize potentially compromising imprecise inferences. Integrity constraints must be defined for the relations
containing imprecise data (set-valued tuple components). This section uses the idea of a context to define domain
and entity integrity constraints for imprecise relations [3,10,17]. The context-based integrity constraints defined
in this section generalize their classical counterparts. The notion of a context will be used in later sections to
formalize imprecise FDs and imprecise inference compromise, which also generalize their precise counterparts.


3.1  Contexts 

A context C is a partition on a set D generated by a semantics-based equivalence relation ρ on D. D is a subset
of the underlying database domain 𝒟. Since ρ captures natural semantic equivalences between domain elements,
the equivalence classes in C are sets of "closely related" (indistinguishable) elements. The set of all equivalence
relations on subsets of 𝒟 is denoted by Π_𝒟; the corresponding context set is C_𝒟.

Contexts can be ordered by the equivalences contained in the generating equivalence relations. A "coarser"
equivalence relation contains more equivalences than a "finer" equivalence relation and yields a "coarser" context
with larger equivalence classes.

Definition: Let ρ and ρ' be equivalence relations in Π_𝒟. Then, ρ' is coarser than ρ, i.e., ρ' ⊑_ρ ρ, iff ρ ⊆ ρ'. If
C and C' are contexts induced by ρ and ρ', respectively, then C' is coarser than C, i.e., C' ⊑_C C.

An equivalence class in a finer context is a subset of an equivalence class in a coarser context. For example,
the relation {{a, b}, {c, d, e}} ⊑_C {{b}, {c, d}} holds. On the other hand, the contexts {{a}, {b, c}} and
{{a, b}, {c}} are not comparable.

(C_𝒟, ⊑_C) is a complete lattice. The coarsest context {𝒟} has a single (largest) equivalence class. The finest
"vacuous" context is induced by the empty equivalence relation.
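
The ordering on contexts can be checked mechanically. The sketch below assumes a context is represented as a list of disjoint frozensets; is_context and is_coarser are hypothetical helper names. It encodes the statement above that every class of the finer context must be a subset of some class of the coarser one.

```python
def is_context(blocks):
    """A context is a partition: non-empty, pairwise disjoint classes."""
    seen = set()
    for block in blocks:
        if not block or seen & block:
            return False
        seen |= block
    return True


def is_coarser(coarse, fine):
    """True iff every class of `fine` is a subset of some class of `coarse`."""
    return all(any(f <= c for c in coarse) for f in fine)


# The comparable pair from the text.
c1 = [frozenset("ab"), frozenset("cde")]
c2 = [frozenset("b"), frozenset("cd")]

# The incomparable pair from the text.
c3 = [frozenset("a"), frozenset("bc")]
c4 = [frozenset("ab"), frozenset("c")]

comparable = is_coarser(c1, c2)
incomparable = not is_coarser(c3, c4) and not is_coarser(c4, c3)
```

Note that the two contexts need not cover the same elements, since contexts are partitions of subsets of the domain.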

3.2  Domain  Integrity 

When  enforcing  a  domain  integrity  constraint,  the  equivalence  classes  in  a  context  act  as  sieve  openings  controlling 
the  maximuni  imprecision  of  information  chunks  storable  as  tuple  components.  An  imprecise  tuple  component  that 
passes  through  a  sieve  opening  in  a  context  (i.e.,  it  is  a  subset  of  an  equivalence  class)  is  said  to  be  consistent  with 
respect to the context. Consistent components are meaningful because equivalence classes comprise semantically
related elements. The null set is not a consistent value in this model. The maximally imprecise value from a
domain Ω is defined as Ω itself.

Definition: An imprecise (set-valued) component t_i is consistent with respect to a context C_i iff it is a non-empty
subset of an equivalence class in C_i.

Definition: An imprecise tuple t = (t_1, t_2, ..., t_n) is consistent with respect to (C_1, C_2, ..., C_n) iff each t_i is
consistent with respect to context C_i.


The classical domain integrity constraint is enforced by contexts with singleton equivalence classes. Only
atomic values are consistent with respect to these "precise contexts." Coarser contexts permit the storage of
larger information chunks as tuple components.


3.3  Equivalence, Entity Integrity and Imprecise Relations

We now define context-based equivalence for imprecise information chunks and the related notion of entity integrity
for imprecise relations. These definitions reduce to their classical counterparts for precise contexts with singleton
equivalence classes.

Definition: Imprecise values (sets) t and t' are equivalent with respect to a context C, denoted by t ~_C t', iff t
and t' are non-empty subsets of the same equivalence class in C.

A context acts as a sieve for determining the equivalence of information chunks. Since an equivalence class
(sieve opening) comprises indistinguishable elements, all consistent chunks passing through the same sieve opening
are considered equivalent. For precise contexts, context-based equivalence reduces to classical equality. The
corresponding precise chunks are equivalent only when they are identical.

Definition: Two imprecise tuples t and t' are redundant with respect to contexts C = (C_1, C_2, ..., C_n), i.e.,
t ~_C t', iff t_i ~_{C_i} t'_i for each component i.

The  classical  entity  integrity  property  requires  that  a  relation  be  a  set  of  tuples.  A  similar  property  is  used  for 
imprecise  databases:  No  two  tuples  in  an  imprecise  relation  may  be  “identical.”  In  this  case,  however,  “identical” 
is  defined  as  equivalence  with  respect  to  contexts,  and  a  “set”  may  not  contain  multiple  equivalent  objects. 

Definition: An imprecise relation scheme R(A, C) is a collection of attributes A = (A_1, A_2, ..., A_n) with associated
contexts C = (C_1, C_2, ..., C_n).

Definition: An imprecise database relation r with underlying scheme R(A, C) is a set of non-redundant tuples
with respect to the contexts in C.


An imprecise relation scheme is a collection of attributes and associated contexts. Since a classical relation
scheme has precise contexts on all its attributes, a classical database relation can only hold precise information.
An imprecise relation scheme has coarser contexts on some or all of its attributes. This enables the corresponding
imprecise relation to consistently hold imprecise information chunks.

Entity integrity is preserved by subsuming redundant tuples. In a classical relation, each set of identical
tuples is simply replaced by one of the tuples. However, extra consideration must be given to an imprecise
relation because it is possible to have redundant imprecise tuples which are different from each other. Entity
integrity is maintained in an imprecise (or precise) relation by "merging" a block of redundant tuples into a single
non-redundant tuple.

Definition: The merge of two tuples t and t' is u = (u_1, u_2, ..., u_n) where u_i = t_i ∪ t'_i.

The merge operation has an important role in imprecise inference analysis, especially for generating imprecise
relations to satisfy induced FDs (recall the example in the previous section). Imprecise inference analysis is
discussed in detail in Section 4.


Name     Experience  Salary
{John}   [7, 10)     {65K}
{Bill}   {2}         [25K, 40K)

An example imprecise relation is presented above. It is defined with respect to a precise context on the Name
attribute, and coarser contexts {[0, 5), [5, 10)} and {[0K, 50K), [50K, 100K)} on the Experience and Salary
attributes, respectively. The tuples in the relation are consistent and non-redundant with respect to these contexts.
However, the tuples become redundant if the coarser contexts, {Men's Names}, {[0, 10)}, and {[0K, 100K)},
are used for the Name, Experience and Salary attributes, respectively. The new merged relation contains a
single tuple obtained by taking the set union of corresponding components in the redundant tuples. Note that
precise information is correctly expressed using singleton sets. However, for simplicity we will represent precise
information as atomic values in the following sections.
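
The consistency, redundancy and merge definitions can be exercised on the example relation with a short Python sketch. Several modelling choices here are assumptions made only for illustration: intervals are discretized to integer years and integer salaries in K, a context is represented as a function that maps a value to the label of its equivalence class, and all values are assumed to lie within the ranges used in the example.

```python
def class_label(component, context):
    """Label of the equivalence class holding `component`, or None if inconsistent."""
    labels = {context(value) for value in component}
    # Empty components and components spanning several classes are inconsistent.
    return labels.pop() if len(labels) == 1 else None


def is_consistent(tup, contexts):
    """A tuple is consistent iff every component fits inside one class."""
    return all(class_label(c, ctx) is not None for c, ctx in zip(tup, contexts))


def redundant(t, u, contexts):
    """Two tuples are redundant iff all components are pairwise equivalent."""
    for a, b, ctx in zip(t, u, contexts):
        label = class_label(a, ctx)
        if label is None or label != class_label(b, ctx):
            return False
    return True


def merge(t, u):
    """Componentwise set union of two tuples."""
    return tuple(a | b for a, b in zip(t, u))


# (Name, Experience, Salary) with [7, 10) -> {7, 8, 9} and [25K, 40K) -> 25..39.
john = (frozenset({"John"}), frozenset({7, 8, 9}), frozenset({65}))
bill = (frozenset({"Bill"}), frozenset({2}), frozenset(range(25, 40)))

# Precise names, Experience classes [0, 5) and [5, 10), Salary classes of width 50K.
fine = (lambda n: n, lambda e: e // 5, lambda s: s // 50)
# One class of names, Experience class [0, 10), Salary class [0K, 100K).
coarse = (lambda n: "Men's Names", lambda e: e // 10, lambda s: s // 100)

merged = merge(john, bill) if redundant(john, bill, coarse) else None
```

Under the finer contexts the two tuples stay distinct; under the coarser ones they collapse into the single merged tuple described above.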


4  Imprecise FDs and Imprecise Inference Channels


This  section  introduces  the  main  concepts  necessary  for  analyzing  imprecise  inference.  It  clarifies  the  notion 
of  an  imprecise  FD  and  presents  inference  axioms  for  determining  closure.  Two  of  these  axioms  correspond  to 
abduction  and  partial  deduction.  Finally,  this  section  defines  the  notion  of  an  imprecise  inference  channel.  Since 
the  definitions  generalize  their  classical  counterparts,  the  imprecise  inference  methodology  also  can  be  used  for 
analyzing  precise  inference. 

The existence of an imprecise FD implies that if some tuple components satisfy certain equivalences, then
other tuple components must exist and their values must be equivalent [7,17]. This notion extends the
equality-based classical FD to one based on equivalence with respect to contexts. Imprecise FDs can specify constraints
on precise and imprecise data. Examples of imprecise FDs are: Engineers have starting salaries of about 40K,
and Approximately equal qualifications and more or less equal experience demand similar salary. Clearly, such
FDs have a key role in imprecise inference analysis.

Definition: An imprecise FD: X(C_X) → Y(C_Y) holds in scheme R(A, C) iff for all tuples t, t' in every extension
r of R, t[X] ~_{C_X} t'[X] implies t[Y] ~_{C_Y} t'[Y].

Lemma 4.1: The imprecise FD: X(C_X) → Y(C_Y) is a classical FD when C_X and C_Y are precise contexts.
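
As an illustration, the check below tests whether an imprecise FD holds in one given extension. This is a sketch under assumptions: the definition quantifies over every extension of the scheme, whereas the hypothetical helper holds_in_extension inspects a single extension, and contexts are modelled as lists of sets. The data is the small X, Y relation from Section 2.

```python
def class_index(component, context):
    """Index of the class containing the non-empty `component`, else None."""
    if not component:
        return None
    for index, block in enumerate(context):
        if component <= block:
            return index
    return None


def holds_in_extension(rows, x_context, y_context):
    """Equivalent X components must always map to equivalent Y components."""
    determined = {}
    for x, y in rows:
        x_class = class_index(x, x_context)
        y_class = class_index(y, y_context)
        if x_class is None or y_class is None:
            raise ValueError("tuple is not consistent with the contexts")
        # Remember the first Y class seen for this X class and compare later ones.
        if determined.setdefault(x_class, y_class) != y_class:
            return False
    return True


ROWS = [({"a"}, {1}), ({"a"}, {2}), ({"b"}, {2})]
PRECISE_X = [{"a"}, {"b"}]
PRECISE_Y = [{1}, {2}]
COARSE_Y = [{1, 2}]

classical_holds = holds_in_extension(ROWS, PRECISE_X, PRECISE_Y)
imprecise_holds = holds_in_extension(ROWS, PRECISE_X, COARSE_Y)
```

With precise contexts on both sides the check reduces to the classical FD test, which fails here because a maps to both 1 and 2; coarsening the Y context to {{1, 2}} makes the imprecise FD hold.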

Determining  the  closure  of  inference  for  a  set  of  imprecise  FDs  is  critical  to  analyzing  imprecise  inference.  The 
classical  mode!  uses  Armstrong’s  axioms  in  defining  the  FD-based  inference  closure.  Counterparts  to  Armstrong’s 
axioms  exist  for  context-based  imprecise  FDs  [17]. 

Lemma 4.2: Imprecise FDs satisfy Armstrong's Axioms:

Y(C_Y) ⊆ X(C_X) ⊆ U(C_U) implies X(C_X) → Y(C_Y) (reflexivity)

X(C_X) → Y(C_Y) implies XZ(C_X C_Z) → YZ(C_Y C_Z) where Z(C_Z) ⊆ U(C_U) (augmentation)

X(C_X) → Y(C_Y) and Y(C_Y) → Z(C_Z) imply X(C_X) → Z(C_Z) (transitivity).

In addition to the counterparts to Armstrong's axioms, new axioms specific to imprecise FDs exist [21]. The
following inference axiom states that if an information chunk x determines some chunk y using an imprecise FD,
then information more precise than x can determine information less precise than y.

Lemma 4.3: X(C_X) → Y(C_Y) implies X(C'_X) → Y(C'_Y) for all C_X ⊑_C C'_X, C'_Y ⊑_C C_Y.

As seen in Section 2, deductive and abductive inference can be unified using the idea of an induced FD. In
the context model, an induced FD is an imprecise FD defined in terms of "induced contexts" [21]. For a relation
containing attributes X and Y, the induced context is the finest context for Y that yields an imprecise FD from
X to Y with context C_X for X. The induced context is computed as the greatest lower bound (glb) of equivalence
classes merged in Y according to values in X. It represents the greatest lower bound on the inferred information
and always exists because the set of contexts is a complete lattice.

Definition: Let F be the set function induced by the mapping X → Y from X tuple components to Y components
and let (t[X])_{C_X} be the set of tuple components that are equivalent to t[X] with respect to C_X. An induced
context on Y, denoted by J_{X→Y}(C_X), is constructed by making the set of Y tuple components in F((t[X])_{C_X})
redundant for each t[X] and then taking the greatest lower bound (glb) of this collection of contexts.
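
One way to read this construction operationally is sketched below in Python. It is an interpretation, not the authors' algorithm: tuples are assumed to be precise (x, y) pairs, the context on X is passed as a hypothetical function x_class that labels each x value with its equivalence class, and the glb is computed by repeatedly uniting Y groups that share an element.

```python
def induced_context(rows, x_class):
    """Finest partition of the observed Y values in which all Y values
    reachable from one X class fall into the same block."""
    groups = {}
    for x, y in rows:
        groups.setdefault(x_class(x), set()).add(y)

    blocks = []  # pairwise disjoint blocks built so far
    for group in groups.values():
        merged = set(group)
        untouched = []
        for block in blocks:
            if block & merged:
                merged |= block  # overlapping blocks must be equivalent
            else:
                untouched.append(block)
        blocks = untouched + [merged]
    return {frozenset(block) for block in blocks}


def precise(value):
    """Precise context: every value is its own equivalence class."""
    return value


# The small X, Y relation from Section 2.
xy_context = induced_context([("a", 1), ("a", 2), ("b", 2)], precise)

# Abduction from Tax back to Salary in R(Salary, Tax) from Section 2.1.
salary_context = induced_context(
    [("10%", "20K"), ("10%", "25K"), ("15%", "35K")], precise
)
```

For the tax table the induced context on Salary is {{20K, 25K}, {35K}}, which matches the salary sets abduced in Section 2.1.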

Having clarified the idea of an induced context, we now present the inference axioms for abduction and partial
deduction (Lemmas 4.4 and 4.5, respectively). ⊥_Y in Lemma 4.5 denotes the coarsest context for Y. It is required
because information about Y is not used in partial deduction.

Lemma 4.4: X(C_X) → Y(C_Y) implies Y(C_Y) → X(J_{Y→X}(C_Y)), and for any C'_X such that Y(C_Y) → X(C'_X)
holds, C'_X ⊑_C J_{Y→X}(C_Y).

Lemma 4.5: XY(C_X C_Y) → Z(C_Z) implies X(C_X) → Z(J_{XY→Z}(C_X ⊥_Y)).

Precise inference channels are composed of a "use set" of FDs [12]. Likewise, imprecise inference channels
are composed of a use set of imprecise FDs. An imprecise inference channel is a chain of imprecise FDs, usually
represented as a directed acyclic graph (dag).

Definition: An imprecise inference channel from X(C_X) to Z(C_Z) is a sequence of imprecise FDs, F =
F_1, F_2, ..., F_n, such that X(C_X) ∈ LHS(F_1), Z(C_Z) ∈ RHS(F_n) and RHS(F_i) = LHS(F_{i+1}).

The notion of an imprecise inference channel and its role in inference analysis and control are clarified using
a series of examples in the following section.


5  Imprecise  Inference  Control  Methodology 

This section outlines the basic imprecise inference control methodology. Imprecise inference analysis and its
application in a practical inference control methodology are described in detail.

5.1  Analyzing  Imprecise  Inference 

The definitions in the previous section provide the formal mechanism necessary for the rigorous analysis of
imprecise inference. Imprecise inference analysis primarily involves the determination of the inference closure.
This closure is computed using the inference axioms defined in the previous section. The inference closure can
be visualized by materializing imprecise relations which correspond to the FDs induced within relations and
across relations with semantically equivalent attributes. The imprecise relations then are composed to create new
relations corresponding to potentially compromising inference channels.

Hypergraphs are used to represent FD-based inference channels. The formal definition of a hypergraph used
in our work is given below.

Definition: A hypergraph is a collection of edges E_H and vertices V_H = 2^S on some basis set S. A vertex is a
subset of S and an edge is a pair of vertices.


Given a relation and a set of FDs (preferably a minimum cover), it is a simple task to derive a hypergraph
whose edges connect vertices containing sets of attributes. Such a hypergraph facilitates exploration of the
inference closure for the database being analyzed. We present a simple algorithm for constructing a hypergraph
from a global relation scheme and a set of FDs.


Hypergraph Construction Algorithm


G: Global scheme; R: Relation scheme; F: Set of FDs; A: Relational attribute
∀ R ∈ G, ∀ A ∈ R: Instantiate a node labelled A_R.
∀ FD: X → Y ∈ F:
    Create a hypernode referencing all nodes with attributes in X if no such hypernode exists.
    Create a hypernode referencing all nodes with attributes in Y if no such hypernode exists.
    Add a hyperedge between these two hypernodes.


Note  that  the  hyperedges  are  bidirectional;  thus,  they  capture  abductive  as  well  as  deductive  paths.  The 
constructed  hypergraph  expresses  all  inference  paths  in  the  database  system.  It  must  be  analyzed  to  detect  all 
potentially  compromising  inference  channels. 
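
The construction algorithm can be sketched in Python as follows. The representation is an assumption made for illustration: a node is a (relation, attribute) pair, a hypernode is a frozenset of nodes, and a bidirectional hyperedge is a frozenset of two hypernodes. The function name build_hypergraph and the sample schemes are hypothetical.

```python
def build_hypergraph(schemes, fds):
    """Build nodes, hypernodes and undirected hyperedges from schemes and FDs.

    schemes: dict mapping a relation name to its list of attributes.
    fds: list of (lhs, rhs) pairs, each side an iterable of (relation, attribute).
    """
    # One node per attribute occurrence, labelled with its relation.
    nodes = {(rel, attr) for rel, attrs in schemes.items() for attr in attrs}
    hypernodes = set()
    hyperedges = set()
    for lhs, rhs in fds:
        x, y = frozenset(lhs), frozenset(rhs)
        unknown = (x | y) - nodes
        if unknown:
            raise ValueError(f"FD refers to unknown attributes: {sorted(unknown)}")
        # Adding to a set creates the hypernode only if it does not exist yet.
        hypernodes.update((x, y))
        # An unordered pair models the bidirectional hyperedge.
        hyperedges.add(frozenset((x, y)))
    return nodes, hypernodes, hyperedges


SCHEMES = {"R1": ["Emp", "Sal"], "R2": ["Sal", "Tax"], "R3": ["Emp", "Tax"]}
FDS = [
    ([("R1", "Emp")], [("R1", "Sal")]),
    ([("R2", "Sal")], [("R2", "Tax")]),
    ([("R3", "Emp")], [("R3", "Tax")]),
]

nodes, hypernodes, hyperedges = build_hypergraph(SCHEMES, FDS)
```

Storing each hyperedge as an unordered pair reflects the remark above that the same edge serves both the deductive and the abductive direction.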

The following simple example illustrates imprecise inference analysis and clarifies many of the previously
defined concepts, including context-based imprecise relation, imprecise FD and imprecise inference channel.
Consider a database containing the three relations, R1(Emp1, Sal1), R2(Sal2, Tax2) and R3(Emp3, Tax3). Note
that R1 and R2 are imprecise relations while R3 is a precise relation.


R1(Emp1, Sal1)

Emp1    Sal1
John    [0K, 50K)
Mary    [50K, 100K)
Joe     [0K, 50K)
Jane    [0K, 50K)

R2(Sal2, Tax2)

Sal2           Tax2
[0K, 10K)      10%
[10K, 20K)     15%
[20K, 35K)     18%
[35K, 50K)     25%
[50K, 80K)     30%
[80K, 100K)    35%

R3(Emp3, Tax3)

Emp3    Tax3
John    18%
Mary    30%
Joe     25%
Jane    18%

Relation R1 expresses "common knowledge" possessed by low users that salaries are in the [0K, 50K) or [50K,
100K) range. This relation corresponds to "catalytic data" [8] added to a database by the DBA to materialize
imprecise inference channels due to common knowledge. The coarse context C_Sal1 = {[0K, 50K), [50K, 100K)}
enables imprecise salaries to be stored consistently in R1. A precise context is used for attribute Emp1. The
imprecise FD: Emp1 → Sal1 holds in R1 for contexts C_Emp1 and C_Sal1. (A precise FD would have held between
the same attributes had the relation R1 been precise.) Since this example deals with the inference of salary values,
let the inference channel IC1 = {Emp1 → Sal1}. IC1 is a use set containing a single imprecise FD.

Relation R2 is a tax table. This table could already exist in the database or it could be added to the database
as a catalytic relation conveying common knowledge. The table contains imprecise information in that multiple
salary values map to a single tax rate; it could be expressed less concisely by a table containing only precise
values. The context C_Sal2 = {[0K, 10K), [10K, 20K), [20K, 35K), [35K, 50K), [50K, 80K), [80K, 100K)} is
used for the salary attribute. The context used for Tax2 is precise. The imprecise FD: Sal2 → Tax2 holds in R2
for contexts C_Sal2 and C_Tax2.

Relation R3 contains precise information and uses precise contexts for all its attributes. The precise FD:
Emp3 → Tax3 holds in R3.

The inference axioms produce the channel IC2 = {Emp1 → Emp3, Emp3 → Tax3, Tax3 → Tax2, Tax2 →
Sal2, Sal2 → Sal1}. Notice that some FDs in the use set arise from the semantic equivalence of attributes, e.g.,
FD: Emp1 → Emp3. The salary information inferred through inference channel IC2 is presented in Relation R_IC2
below. The new relation is obtained by joining relations R3 and R2 as specified in the use set for IC2.

R_IC2(Emp, Sal)

Emp     Sal
John    [20K, 35K)
Mary    [50K, 80K)
Joe     [35K, 50K)
Jane    [20K, 35K)

We can see that a potential imprecise inference compromise exists because the derived relation R_IC2 has salary
information of finer granularity than the original relation R1. More precisely, the compromise exists because the
derived inference channel IC2 is not coarser than the original channel IC1. The induced context C_Sal in R_IC2
(and IC2) is equal to {[20K, 35K), [35K, 50K), [50K, 80K)}. It is not coarser than C_Sal1 = {[0K, 50K), [50K,
100K)} in R1 (and IC1).
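
The derivation of R_IC2 and the comparison with R1 can be reproduced with a short Python sketch. The encoding is an assumption for illustration only: half-open salary ranges are (low, high) pairs in units of K, tax rates are integers, and the helpers join_on_tax and is_finer are hypothetical names. The finer-granularity test assumes the inferred intervals are disjoint, which holds here because they come from a context.

```python
R1 = {"John": (0, 50), "Mary": (50, 100), "Joe": (0, 50), "Jane": (0, 50)}
R2 = {(0, 10): 10, (10, 20): 15, (20, 35): 18,
      (35, 50): 25, (50, 80): 30, (80, 100): 35}
R3 = {"John": 18, "Mary": 30, "Joe": 25, "Jane": 18}


def join_on_tax(emp_tax, sal_tax):
    """Abduce salary ranges from tax rates, then attach them to employees."""
    by_tax = {}
    for interval, tax in sal_tax.items():
        by_tax.setdefault(tax, []).append(interval)
    return {emp: by_tax.get(tax, []) for emp, tax in emp_tax.items()}


def is_finer(inferred, known):
    """True if the inferred ranges lie inside `known` and cover strictly less."""
    low, high = known
    inside = all(low <= a and b <= high for a, b in inferred)
    covered = sum(b - a for a, b in inferred)
    return inside and 0 < covered < high - low


R_IC2 = join_on_tax(R3, R2)
compromised = sorted(emp for emp, ranges in R_IC2.items() if is_finer(ranges, R1[emp]))
```

Every employee ends up in the compromised list, which mirrors the observation that the derived channel yields finer salary information than the common knowledge recorded in R1.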

5.2  Controlling  Imprecise  Inference 

To control and ultimately eliminate compromising imprecise inferences, it is necessary to specify a set of imprecise
inference channels considered to be secure. This compromise specification set is defined by the DBA. Suspect
channels in a database are compared with these secure channels. A potential security compromise exists when a
suspect channel allows the inference of information which is not coarser than information inferred by any secure
channel with a finer RHS context. Potentially compromising imprecise inference channels are eliminated by hiding
relational attributes or by "clouding" data manifested by imprecise FDs in the compromising channels.

The following algorithm describes the basic inference control methodology. Note that the compromise
specification set is to be defined by the DBA. The definitions of compromise() and eliminate(), which detect and
eliminate potentially compromising inference channels, respectively, will be provided later.

Inference Control Algorithm

Define compromise specification set.
Construct hypergraph from global schema G and FD set F.
P: Set of paths in hypergraph.
For each p_i in P: If compromise(p_i), then eliminate(p_i).


5.2.1 Compromise Specification and Detection

In a typical scenario, the DBA first identifies the set of imprecise inference channels called the compromise
specification set. This set conveys the finest information that can be inferred without compromising database
security. It typically contains simple channels composed of FDs holding in the database relations being analyzed
and those holding in additional catalytic relations. As mentioned earlier, catalytic relations [8] are additional
relations added to the database being analyzed. They convey common knowledge about secure attribute values
and are used to materialize potentially compromising imprecise inferences that might otherwise go unnoticed. A
suitable assumption used when constructing the compromise specification set is that imprecise FDs and inference
channels within a relation are secure. Compromising inferences across relations can occur when attributes in
different relations are semantically equivalent. A similar assumption is used in the DISSECT system.

We examine individual inference paths in the hypergraph constructed from inference analysis for suspected
compromise. For inference path exploration, the hypergraph can be reduced to a regular graph whose vertices are
hyperedges [21]. That is, a hyperedge E(a,b) with hypernodes a and b becomes a vertex in a regular graph.
Two such vertices, V_E1 and V_E2, are connected if the intersection of a hypernode of attributes from V_E1 and a
hypernode of attributes from V_E2 is nonempty. Finding inference channels to test for potential compromise now
reduces to path enumeration for regular graphs. The algorithm for hypergraph reduction is presented below.

Hypergraph Reduction Algorithm

E_H: set of hypergraph edges; V_H: set of hypergraph vertices
E_R: set of regular graph edges; V_R: set of regular graph vertices
For each E(a,b) in E_H: add V_E(a,b) to V_R.
For each pair (V_E(a,b), V_E(c,d)) in V_R:
If {a U b} n {c U d} is not empty, then add (V_E(a,b), V_E(c,d)) to E_R.

Figure 1: Example Hypergraph (legend: FD link; semantic equivalence)

Figure 2: Inference Channel: IC2 (legend: FD link; semantic equivalence)
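
The hypergraph reduction step can be sketched in Python as follows. This is a minimal sketch under stated assumptions: each hyperedge is given as a pair of attribute sets (its two hypernodes), and the function name reduce_hypergraph, the dictionary EXAMPLE, and the edge labels are hypothetical names chosen for illustration, loosely modelled on the running example.

```python
from itertools import combinations


def reduce_hypergraph(hyperedges):
    """Reduce a hypergraph to a regular graph whose vertices are hyperedges.

    `hyperedges` maps a hyperedge name to a pair (a, b) of attribute sets.
    Two hyperedges become adjacent when (a | b) & (c | d) is nonempty.
    """
    vertices = set(hyperedges)
    edges = set()
    for e1, e2 in combinations(sorted(hyperedges), 2):
        a, b = hyperedges[e1]
        c, d = hyperedges[e2]
        if (a | b) & (c | d):
            edges.add((e1, e2))
    return vertices, edges


# Illustrative hyperedges (an assumption, not the full graph of Figure 1).
EXAMPLE = {
    "Emp1->Sal1": ({"Emp1"}, {"Sal1"}),
    "Emp1->Emp3": ({"Emp1"}, {"Emp3"}),
    "Emp3->Tax3": ({"Emp3"}, {"Tax3"}),
    "Sal2->Tax2": ({"Sal2"}, {"Tax2"}),
}

V_R, E_R = reduce_hypergraph(EXAMPLE)
for edge in sorted(E_R):
    print(edge)
```

Once the regular graph is available, candidate inference channels correspond to paths in it, so any standard path enumeration routine can be applied.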

A suspect inference channel originating at X that allows the inference of information about Y can be deemed
secure when that information is coarser than the inferred information of any inference channel from X to Y in
the compromise specification set. On the other hand, a potential security compromise exists, i.e., compromise(p_i)
is true, when there does not exist a secure inference channel in the compromise specification set such that the
information inferred from the suspect channel is coarser than the information inferred from the secure channel.
We use the context model to formalize the notion of a potentially compromising imprecise inference channel that
is used in our inference control algorithm.

Definition: A derived inference channel IC' for X(C'_X) -> Y(C'_Y) is a potentially compromising imprecise
inference channel, i.e., compromise(IC') is true, iff C'_Y is not coarser than C_Y, i.e., C_Y ⊑_c C'_Y does not hold,
for some IC for X(C_X) -> Y(C_Y) in the compromise specification set.

For the example in the previous subsection, the compromise specification set is equal to {Emp1 -> Sal1,
Sal2 -> Tax2, Emp3 -> Tax3}. The inferences possible in the example database are represented in the hypergraph
in Figure 1. A potentially compromising inference channel exists because the derived channel IC2: {Emp1 ->
Emp3, Emp3 -> Tax3, Tax3 -> Tax2, Tax2 -> Sal2, Sal2 -> Sal1} has a context for C_Sal1 which is not coarser
than the corresponding context for inference channel IC1: {Emp1 -> Sal1} in the compromise specification set.
The derived inference channel IC2 is presented in Figure 2.

Detecting imprecise inference compromises is an NP-complete problem [21]. This follows from the fact that
imprecise inference compromise detection generalizes the problem of detecting precise FD-based inference
channels; the latter was shown to be NP-complete by Su and Ozsoyoglu [19,20]. Exhaustive search is the obvious
strategy for detecting compromising channels. Large databases will require the use of heuristics for compromise
detection. Practical heuristics for compromise detection are presented in [21]. Allowing for user
interaction during the detection phase enhances the quality of the search.

5.2.2 Compromise Elimination

Securing a database system from imprecise inference attacks based on FDs mandates securing all potentially
compromising imprecise inference channels. Each channel can be secured by hiding (upgrading) relational attributes
[18-20]. Alternatively, the compromising inference channels may be secured by appropriately "clouding" tuple
components [7,21]. Information used in a potentially compromising inference channel is clouded so that the
inferred information is coarser than that inferred using any related secure inference channel. When a suspect
channel has no secure counterpart in the compromise specification set, it must be secured according to policies
specified by the DBA.

In our example it is necessary only to secure the inference channel IC2. To illustrate the technique, we
secure the database using information clouding. This is accomplished by applying the finest clouding of the
channel to eliminate the compromise. Selecting the channels to cloud and determining how much to cloud is
an NP-complete problem, but for short inference channels a simple heuristic suffices [7,21]. According to this
heuristic, the inference channel is traced backwards and information in the first acceptable attribute is clouded.
An attribute in an imprecise FD is not acceptable for clouding if the mapping is common knowledge (e.g., FD
Sal2 -> Tax2 represents the tax table) or if the dependency arises due to the semantic equivalence of attributes
(e.g., FD Emp1 -> Emp3). The clouding algorithm presented below effectively secures compromising inference
channels. It is an appropriate specification of the function eliminate() in the inference control algorithm.

Clouding Algorithm

x = (X_1, ..., X_n): compromising inference channel
C_X: context for attribute X
Choose y = (Y_1, ..., Y_m) to be a related secure channel
(where X_n and Y_m refer to the same attribute).
Set i = n.
while cannot cloud X_i: i = i - 1

Set  CfA:,)  =  I'vhcn-  .3  ::r  (.Y.: .  V,)). 


Applying this heuristic to our example, we select attribute Tax3 in the FD: Emp3 -> Tax3 for clouding (or
hiding). We cloud it by determining what can be inferred about Tax3 using the inverse channel, IC2^-1, which
infers Emp1 from Sal1, using the imprecise values known for Sal1, i.e., [0K, 50K), [50K, 100K). This inference
channel yields imprecise tax values of {10%, 15%, 18%, 25%} and {30%, 35%} from imprecise salary values of
[0K, 50K) and [50K, 100K), respectively. The relation R3(Emp3, Tax3) is coarsened appropriately to yield the
clouded relation below.

R3(Emp3, Tax3), clouded

Emp3    Tax3
John    {10%, 15%, 18%, 25%}
Mary    {30%, 35%}
Joe     {10%, 15%, 18%, 25%}
Jane    {10%, 15%, 18%, 25%}
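
The clouded tax values can be reproduced with a short Python sketch. It is an illustration under stated assumptions rather than the paper's algorithm: salary brackets are half-open intervals in thousands, the 18% rate for the [20K, 35K) bracket is inferred from the clouded sets quoted in the text, and the names TAX_TABLE, SAL1, clouded_tax, and R3_CLOUDED are hypothetical.

```python
# Tax table R2(Sal2, Tax2): half-open salary brackets in K mapped to a rate in %.
# The 18% entry is an assumption recovered from the clouded sets in the text.
TAX_TABLE = [
    ((0, 10), 10),
    ((10, 20), 15),
    ((20, 35), 18),
    ((35, 50), 25),
    ((50, 80), 30),
    ((80, 100), 35),
]

# Imprecise salaries known from R1(Emp1, Sal1).
SAL1 = {"John": (0, 50), "Mary": (50, 100), "Joe": (0, 50), "Jane": (0, 50)}


def clouded_tax(sal_range):
    """Return every tax rate whose bracket overlaps the imprecise salary range."""
    lo, hi = sal_range
    return {rate for (b_lo, b_hi), rate in TAX_TABLE if b_lo < hi and lo < b_hi}


# Clouded relation R3(Emp3, Tax3): one set of possible rates per employee.
R3_CLOUDED = {emp: clouded_tax(rng) for emp, rng in SAL1.items()}

for emp, rates in R3_CLOUDED.items():
    print(emp, sorted(rates))
```

Because each clouded set is derived only from the coarse salary range already available through the secure channel, the clouded relation reveals nothing finer than that range.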

The final database contains relation R1(Emp1, Sal1) representing a priori knowledge, the tax table R2(Sal2,
Tax2), and R3(Emp3, Tax3) in which Tax3 values are clouded to eliminate the imprecise inference channel. Note
that in the new database the inference channel IC2 yields information which is coarser than that provided by the
secure channel IC1. This verifies that the channel is secure. The secured database is shown below.


R1(Emp1, Sal1)

Emp1    Sal1
John    [0K, 50K)
Mary    [50K, 100K)
Joe     [0K, 50K)
Jane    [0K, 50K)

R2(Sal2, Tax2)

Sal2           Tax2
[0K, 10K)      10%
[10K, 20K)     15%
[20K, 35K)     18%
[35K, 50K)     25%
[50K, 80K)     30%
[80K, 100K)    35%

R3(Emp3, Tax3)

Emp3    Tax3
John    {10%, 15%, 18%, 25%}
Mary    {30%, 35%}
Joe     {10%, 15%, 18%, 25%}
Jane    {10%, 15%, 18%, 25%}

6 Conclusions

Research on imprecise inference by Morgenstern [13,14] and Garvey and Lunt [4,6] has highlighted the need to
develop formal, yet practical, techniques for dealing with the difficult problem of imprecise inference control.
The context formalism presented in this paper meets these requirements. Contexts provide a powerful and
systematic approach for modeling imprecise inference. They also define integrity constraints for imprecise relations
and the key notion of an imprecise FD, which generalizes precise deduction and unifies the primitive imprecise
inference mechanisms of abduction and partial deduction. Whereas other imprecise (partial) inference models
estimate the probability or possibility of making inferences, the context-based methodology examines the actual
imprecise values inferred in a database extension. Imprecise inference analysis is performed by transforming a
precise database, possibly augmented with additional catalytic relations conveying a priori knowledge available
to low users, into an equivalent imprecise database. This technique, which is similar to those used in deductive
databases, effectively models the propagation of compromising imprecise inferences in MLS databases. With
minimal modifications, the same inference analysis technique can be applied to knowledge discovery or database
mining.

The inference control formalism described in this paper provides a foundation for the development of practical
inference controllers for database systems. The DISSECT prototype developed at SRI International [5,15,18]
demonstrates the utility of an interactive tool for controlling database inference. The context-based model is
particularly suited for implementation in an interactive environment. The flexibility of the model will enable the
resulting tools to secure classical, MLS, and imprecise databases from precise and imprecise inference attacks.

ABSTRACT

This  paper  describes  a  hypersemantic  data  model  for  representing  multilevel  database  applications.  This  data 
model, which is called multilevel knowledge data model (MKDM), incorporates constructs from both data
models  and  knowledge  models.  An  associated  data  definition  language  called  multilevel  knowledge  data  language 
(MKDL)  and  a  graphical  representation  scheme  are  also  described.  Finally,  knowledge  transformation  as  well  as 
inference  analysis  issues  are  discussed.  Our  goal  is  to  develop  a  uniform  representation  and  specification  scheme 
which  can  not  only  be  used  for  inference  analysis,  but  which  can  also  be  transformed  into  other  representation 
schemes  so  that  existing  inference  analysis  tools  can  be  applied. 

1.  INTRODUCTION 

It  is  possible  for  users  of  any  database  management  system  to  draw  inferences  from  the  information  that  they 
obtain  from  the  databases.  The  inferred  knowledge  could  depend  only  on  the  data  obtained  from  the  database 
system  or  it  could  depend  on  some  prior  knowledge  possessed  by  the  user  in  addition  to  the  data  obtained  from 
the  database  system.  The  inference  process  can  be  harmful  if  the  inferred  knowledge  is  something  that  the  user  is 
not authorized to acquire. That is, a user acquiring information which he is not authorized to know has come to
be  known  as  the  inference  problem  in  database  security. 

We  are  particularly  interested  in  the  inference  problem  which  occurs  in  a  multilevel  operating  environment.  In 
such  an  environment,  the  users  are  cleared  at  different  security  levels  and  they  access  a  multilevel  database  where 
the  data  is  classified  at  different  sensitivity  levels.  A  multilevel  secure  database  management  system 
(MLS/DBMS)  manages  a  multilevel  database  where  its  users  cannot  access  data  to  which  they  are  not  authorized. 
However,  providing  a  solution  to  the  inference  problem,  where  users  issue  multiple  requests  and  consequently 
infer  unauthorized  knowledge,  is  beyond  the  capability  of  currently  available  MLS/DBMSs. 

Morgenstern was one of the first to investigate the inference problem for MLS/DBMSs [MORG87]. Since
then, several efforts have been reported. One of the major approaches to handling the inference problem is to
design the multilevel database in such a way that certain security violations are prevented (see for example the
work of Binns [BINN92], Burns [BURN92], Hinke et al. [HINK92], Garvey et al. [GARV92], Smith
[SMIT90], and Thuraisingham [THUR90]). That is, the security constraints, which are rules that assign security
levels to the data, are processed during multilevel database design and subsequently the schemas are assigned
appropriate security levels. While some of the proposed solutions focus on representing the multilevel database
application using conceptual structures developed for knowledge-based system applications and subsequently
reasoning about the application using deduction techniques (see for example [HINK92, GARV92, and
THUR90]), some focus on developing tools which generate new relational database schemas given the original
relational database schemas and the security constraints (see for example [BINN92]), and some others are
proposing the use of semantic data models developed for database design to design the multilevel database also
(see for example [BURN88]).

While  the  three  approaches  complement  each  other,  each  of  them  uses  different  representation  schemes  and 
different  inference  analysis  tools.  That  is,  the  approaches  are  highly  specialized  and  therefore,  if  we  use  one  tool 
for  inference  analysis,  then  we  cannot  take  advantage  of  the  useful  features  offered  by  the  other  tools.  Since  no 
tool  can  handle  all  types  of  inference  problems  and  the  problems  handled  by  each  tool  are  not  the  same,  some  of 
the  inference  problems  cannot  be  handled  with  the  current  approaches.  What  we  need  is  a  powerful  uniform 
representation scheme which can be transformed into specific representation schemes without much complexity.
Once this is done, the various inference analysis tools can be applied. This way one can take advantage of the
tools  that  have  been  developed  without  having  to  re-invent  the  wheel.  The  purpose  of  this  paper  is  to  describe  the 
essential  points  of  this  uniform  representation  method.  That  is,  we  are  not  interested  in  simply  developing  yet 
another  model  for  representing  the  application  and  conducting  inference  analysis.  Our  goal  is  to  develop  a 
uniform  representation  and  specification  scheme  which  can  not  only  be  used  for  inference  analysis,  but  which 
can  also  be  transformed  into  other  representation  schemes  so  that  existing  inference  analysis  tools  can  be  applied. 

The  data  model  that  is  to  be  used  to  develop  the  uniform  representation  scheme  should  capture  not  only  the 
entities of the application and the relationships between them, but also provide the means to specify constraints,
heuristics and other complex relationships. What is needed is a model for bridging the gap between data and
knowledge base systems. This is because data models have traditionally focused on representing data while
knowledge models have focused on knowledge representation for expert systems. An integrated model would be
appropriate for capturing the structural properties, the semantics, and the constraints of the application. Such
integrated models have been called hypersemantic data models in the literature [POTT89]. This paper discusses
the  use  of  hypersemantic  data  modeling  for  capturing  the  semantics  as  well  as  the  structural  aspects  of  multilevel 
database  applications,  describes  the  generation  of  multilevel  database  schemas  from  this  representation,  and 
shows  how  inference  analysis  tools  could  be  applied. 


The hypersemantic data model that we have developed is called Multilevel Knowledge Data Model (MKDM).
This model extends Potter et al.'s [POTT89] hypersemantic data model for multilevel database applications. It
integrates data and knowledge and captures the static as well as temporal aspects of the application. We have also
developed a graphical representation scheme called GRAPHICAL-MKDM as well as a specification language
called Multilevel Knowledge Data Language (MKDL) for MKDM. The representation schemes that we have
developed are general enough to be transformed into conceptual structures such as semantic nets, logic
programming languages, as well as SQL. This way, the inference analysis tools that have been developed for
other representation schemes could be applied. It should also be noted that inference analysis tools could be
developed for specifications in GRAPHICAL-MKDM and MKDL. Such tools could find those potential
inference problems that can be uncovered with the complex constructs of MKDM and which are not
straightforward with the other representation schemes.

The organization of this paper is as follows. The hypersemantic data model that we have developed as well as
the associated specification language are discussed in Section 2. In particular, the essential constructs of MKDM,
MKDL (the language for specifying MKDM), and a graphical representation scheme are described. In Section 3
we describe how the graphical representation scheme and MKDL specification can be transformed into some of
the representation schemes proposed by others in the literature. In particular, we show how an MKDM-based
representation of an application can be transformed into representations based on conceptual structures, logic
programming specification, and extended SQL specification. The types of inferences to be handled as well as
applying different inference analysis tools on the various representation schemes of the application are described
in Section 4. Related work is discussed in Section 5. The paper is concluded in Section 6 with a discussion of
future considerations.

2.  DETAILS  OF  THE  HYPERSEMANTIC  DATA  MODEL 

This section describes the details of the hypersemantic data model that we have developed. In section 2.1 we
describe MKDM. In section 2.2, we describe MKDL. Our graphical representation scheme is described in
section 2.3.

2.1 MULTILEVEL KNOWLEDGE DATA MODEL

2.1.1  OVERVIEW 

In this section, we describe the model that we have developed for representing multilevel database
applications. This model, which incorporates constructs from semantic data models and knowledge models, is
called a multilevel knowledge data model (MKDM). It incorporates data, knowledge, and security semantics of
an application. As in most semantic models, we use the notion of an object to represent any structural entity in the
application. Such an entity could be a person Joe, or a dog Lassie, or an airplane AAA. Each entity consists of a
set of properties which describe that entity. The essential constructs of the model are the following:

(i) Classification: objects with similar properties (which are also called attributes) are grouped into an
object-type via the "instance-of" relationship.

(ii)  Generalization:  a  subset  of  the  objects  of  an  object-type  which  have  common  properties  are  grouped  into 
another  object-type  via  the  "is-a"  relationship. 

(iii) Aggregation: an object is composed of multiple components via the "is-part-of" relationship.

(iv) Membership: a collection of object-types is abstracted into a new object-type via the "is-a-member-of"
relationship. An instance of this new type will consist of a collection of objects, one from each object-type
which formed the new type.

(v)  General  constraints:  a  restriction  is  placed  on  some  aspect  of  an  object  (such  as  the  value  of  its 
property),  operation,  or  relationship  via  the  "is-constraint-on"  relationship. 

(vi) Heuristics: an information derivation mechanism is attached via the "is-heuristics-on" relationship.¹

(vii)  Temporal:  specific  object  types  are  related  by  synchronous  or  asynchronous  relationships. 

(viii)  Security  constraints:  a  restriction  is  placed  on  the  security  level  of  an  object,  an  object-type,  an 
attribute,  or  on  the  association  between  the  property  of  an  object  and  the  value  of  this  property.  The 
assignment  of  the  security  level  may  depend  on  the  content,  context,  or  time.  Each  classification  constraint 
may  have  a  corresponding  explanation  for  the  restriction. 

The following example illustrates briefly the essential points in the modeling constructs. An example of the
classification construct is the grouping of all students into a STUDENT type. An example of the generalization
construct is grouping all students over age 25 into a type called ADULT-STUDENT. An example of an
aggregation construct is a book MATH-A which consists of the components INTRODUCTION, CALCULUS,
ALGEBRA, GEOMETRY, and CONCLUSION. An example of a membership construct is (STUDENT,
TEACHER) whose instances are the (student, teacher) pairs. An example of a general constraint is a rule that
each student is bounded by the number of courses taken depending on his year. An example of a heuristic
construct is the inference rule that if a student is in his senior year, then he must have taken at least 20 courses.
An example of a temporal construct is that before a student starts his thesis he must finish his qualifying exams
and his oral exams. An example of a security constraint is assigning the Secret level to the association between
the GRADE property of the STUDENT object-type and its value.

In sections 2.1.2 to 2.1.9 we will discuss the details of each construct with examples. It should be noted
that  security  constraints  are  a  special  type  of  construct.  They  are  used  in  the  inference  analysis  process  during  the 
design  of  the  application.  They  are  also  used  by  the  inference  analysis  tools  to  generate  the  database  schema. 

2.1.2  CLASSIFICATION  CONSTRUCT 

The purpose of MKDM is to define the objects, their properties, and the associated security levels. Each object
has  associated  with  it  a  security  level  which  we  assume  is  the  existence  level  of  the  object.  That  is,  if  an  object's 
level  is  L,  then  its  existence  is  known  at  level  L  or  higher. 

The  classification  construct  is  used  to  group  objects  with  similar  properties  into  an  object-type.  Now,  the 
question  is,  should  an  object-type  have  a  security  level?  Since  objects  have  levels,  it  is  reasonable  to  assume  that 
an  object-type  which  groups  objects  also  has  a  level.  This  level  is  the  existence  level  of  the  object-type.  The 
relationship between the levels of the object and its object-type should be such that it is not possible to have a
security violation via inference. For example, if an object's level is L1, its object-type's level is L2, and L1 < L2,
then one could infer at L1 information about the object-type. So, it would be safer to have L1 ≥ L2.


¹ Integrity constraints are a form of general constraints. General constraints specify causal relationships while heuristics do not. An
example of a general constraint is "if a student's GPA is less than 2.0, then he cannot take Math 231". There is a cause for a student
not to take Math 231. An example of a heuristic constraint is "if a student is in his senior year, then he must have taken 20
courses". Here, the fact that a student has taken 20 courses has nothing to do with him being in his senior year.

A model should also capture the properties of objects which are specified in the object-type classifying that
object. Now, what should be the relationship between the level of an object-type and the level of its attributes?
What is the connection between the level of an attribute and the level of its values?² It is felt that at higher levels
one could have information about additional attributes for a class. Also, if the level L1 of the attribute is
dominated by the level L2 of the object-type, then at level L1, one could infer the existence of the object-type.
Therefore, it is safer to have the property that L2 ≤ L1. Now, knowing that A is an attribute of object-type O
does not mean that one can read the value of the attribute. So, the level of the value (which is actually the level of
the association between the attribute and its value) dominates the level of the attribute.³ Otherwise, one could
read the value and infer that there is an attribute. The essential points are illustrated in figure 1. The STUDENT class
is Unclassified with instances which are Unclassified and Secret. The attributes Name and GPA are Unclassified
while the attribute Bank-account is Secret. But the value of the GPA attribute (i.e., the association between the
GPA attribute and its value) is Secret. Figure 1 also illustrates an Unclassified instance of the object-type
STUDENT. This instance has name Jane, GPA of 3.8 and bank account ID of ppp. The association between the
name attribute and its value is Unclassified while the associations between the other two attributes and their
respective values are Secret.⁴
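
The dominance rules discussed above lend themselves to a mechanical check. The following Python sketch is illustrative only: it assumes the four levels form a simple total order U < C < S < TS, and the names RANK, dominates, and check_classification are hypothetical rather than part of MKDM or MKDL.

```python
# Assumed total order on security levels: U < C < S < TS.
RANK = {"U": 0, "C": 1, "S": 2, "TS": 3}


def dominates(a, b):
    """True if level `a` dominates (is at least as high as) level `b`."""
    return RANK[a] >= RANK[b]


def check_classification(type_level, object_level, attributes):
    """Check the level rules for one object of an object-type.

    `attributes` maps an attribute name to (attribute_level, value_level).
    Returns a list of human-readable violations (empty if none).
    """
    violations = []
    if not dominates(object_level, type_level):
        violations.append("object level does not dominate object-type level")
    for name, (attr_level, value_level) in attributes.items():
        if not dominates(attr_level, type_level):
            violations.append(f"attribute {name} does not dominate object-type level")
        if not dominates(value_level, attr_level):
            violations.append(f"value of {name} does not dominate attribute level")
    return violations


# The Unclassified STUDENT instance described in the text.
JANE = {"Name": ("U", "U"), "GPA": ("U", "S"), "Bank-account": ("S", "S")}
print(check_classification("U", "U", JANE))  # []
```

An empty result means the instance satisfies all three rules; a value classified below its attribute, for instance, would be reported as a violation.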

Other complex types of security constraints should also be taken into consideration during the inference
analysis process. For example, one could classify the GPA value at the Secret level only if the student is
attending a military academy. Another constraint would be to classify the GPA value at the Secret level and
enforce a general constraint that the teacher attribute of a student implies the GPA value. A third constraint would
be to classify the association between the value of the GPA attribute and the value of the name attribute at the
TopSecret level. Such constraints should be taken into consideration when the security levels are assigned. For
example, if the teacher implies the GPA (assuming that certain teachers only teach students with a higher GPA
value) and GPA values are Secret, then so should be the teacher values. Otherwise, by inference, an Unclassified
user could infer the Secret GPA values.
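
The teacher and GPA example suggests a simple propagation rule: whenever one attribute implies another, the implying attribute must be classified at least as high as the implied one. The sketch below is a hypothetical illustration of that rule, assuming a total order U < C < S < TS; the names RANK and propagate_levels are not taken from the paper.

```python
# Assumed total order on security levels: U < C < S < TS.
RANK = {"U": 0, "C": 1, "S": 2, "TS": 3}


def propagate_levels(levels, implications):
    """Raise levels until every implying attribute dominates what it implies.

    `levels` maps an attribute name to its level; `implications` is a list of
    (source, target) pairs meaning "source implies target".
    """
    result = dict(levels)
    changed = True
    while changed:  # terminates: levels only rise and are bounded by TS
        changed = False
        for source, target in implications:
            if RANK[result[source]] < RANK[result[target]]:
                result[source] = result[target]
                changed = True
    return result


# Teacher implies GPA, and GPA values are Secret.
print(propagate_levels({"Teacher": "U", "GPA": "S"}, [("Teacher", "GPA")]))
```

In this example the teacher values are raised to Secret, so an Unclassified user can no longer use them to infer the Secret GPA values.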


(STUDENT)
Name: Jane, U
GPA: 3.8, S
Bank-account: ppp, S

Figure 1. Classification Construct

2.1.3 GENERALIZATION CONSTRUCT

Generalization is the relationship between an object-type and specialized cases of this object-type. These
specialized cases are called subtypes. It results in an object-type hierarchy, also referred to as the IS-A hierarchy.
An object-type may have multiple subtypes associated with it. For example, an object-type PERSON could have
subtypes STUDENT and EMPLOYEE. PERSON is called the supertype of both STUDENT and EMPLOYEE.
The properties of PERSON are inherited by both STUDENT and EMPLOYEE.


STUDENT (object-type box of Figure 1)

U Attributes
  Name - Type: String
         Level: level of instance
  GPA - Type: Integer
        Level: level of instance L if L ≥ Secret; Secret if L < Secret

S Attributes
  Bank-account - Type: String
                 Level: level of instance L if L ≥ Secret; Secret if L < Secret

² Note that by the level of an attribute we mean the level of the existence of the attribute. The level of the value is actually the level
of the association between the attribute and its value.

³ Note that by the level of the value of an attribute we actually mean the level of the association between the attribute and its value.
For example, if the salary value is 20K, then it is meaningless to assign a level to 20K. A security level is assigned to the
association between the salary attribute and its value, which is 20K.

⁴ We use the letters U, C, S, and TS for Unclassified, Confidential, Secret, and TopSecret, respectively.

The generalization and classification constructs seem to address the same problem. However, the
generalization construct specifies a "top-down" structure, that is, given a large set of objects, how can they be
decomposed into smaller sets? The classification construct is basically "bottom-up", that is, given a single object,
what larger set is it an instance of? This distinction is important for inference analysis and will be discussed later.

Now, in MKDM, the issue is: what should the relationship be between the level of an object-type and the
level of its subtype? It is felt that subtyping could be used to protect the more sensitive attributes. Therefore, the
property that is enforced is one where the level of a subtype dominates the level of the object-type. This is
illustrated in figure 2, where the EMPLOYEE subtype is Secret while the PERSON object-type is Unclassified.
If this is not the case, then from the information about the subtype, one could infer the more sensitive information
about the object-type.⁴


Figure 2. Generalization Construct

The next question is, what about inheritance? It is assumed that everything in the supertype is inherited by the
subtype. However, if the levels of the attributes of the supertype are lower than the level of the subtype itself, then
these inherited attributes of the subtype are assigned the level of the subtype. That is, if EMPLOYEE inherits the
name attribute of PERSON, the name attribute of EMPLOYEE is Secret. Multiple inheritance, which occurs
when a subtype has multiple supertypes, has interesting consequences with respect to the inference problem. We
discuss some issues in section 4.⁵
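
The level assignment for inherited attributes can be sketched as follows. This is only an illustration, assuming a totally ordered set of levels (U < C < S < TS) so that the least upper bound is simply the maximum; the names Level, lub, and inherited_level are hypothetical and not part of MKDM.

```python
from enum import IntEnum


class Level(IntEnum):
    """Security levels, assumed here to be totally ordered."""
    U = 0
    C = 1
    S = 2
    TS = 3


def lub(*levels):
    """Least upper bound; for a total order this is the maximum."""
    return max(levels)


def inherited_level(subtype_level, *supertype_attr_levels):
    """Level of an inherited attribute in the subtype.

    The attribute is raised to the subtype's level if it was lower, and
    when several supertypes contribute the same attribute at different
    levels, the least upper bound of all levels involved is taken.
    """
    return lub(subtype_level, *supertype_attr_levels)


# EMPLOYEE (Secret) inherits the Unclassified name attribute of PERSON.
employee_name_level = inherited_level(Level.S, Level.U)
```

In this sketch, employee_name_level evaluates to Level.S, matching the EMPLOYEE example in the text.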

2.1.4  AGGREGATION  CONSTRUCT 

The aggregation construct models the IS-PART-OF relationship. For example, a student text book could
consist of several chapters. Each chapter may consist of a collection of paragraphs. The question is, what should
be the relationship between the level of an object and the level of its components? Can a Secret book have
Unclassified chapters or can an Unclassified book have Secret chapters?⁶ A flexible model should support both.
However, one needs to assign appropriate levels so that security violations via inference do not occur. For
example, if the existence of a chapter is to be kept Secret, then one cannot classify the association between the
composite attribute and its value for that particular component chapter at the Unclassified level.⁷ Otherwise the
existence of the chapter would be inferred at the Unclassified level. Figure 3 illustrates a composite object book
whose chapters are A, B, and C. The composite attribute as well as the existence of all of its components are
Unclassified. The association between the attribute and chapters A and C is Unclassified while the association
between the attribute and chapter B is Secret. That is, an Unclassified user knows that there are three chapters,
but he can only read chapters A and C.
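
The book example can be illustrated with a small sketch. It assumes totally ordered levels and a dictionary layout invented for this illustration (the keys "existence" and "association" and the function view are hypothetical); it is not a definitive encoding of the aggregation construct.

```python
LEVELS = {"U": 0, "C": 1, "S": 2, "TS": 3}

# Composite object "book": every chapter exists at the Unclassified level,
# but the association between the composite attribute and chapter B is Secret.
BOOK = {
    "A": {"existence": "U", "association": "U"},
    "B": {"existence": "U", "association": "S"},
    "C": {"existence": "U", "association": "U"},
}


def dominates(a, b):
    """True if level a dominates (is at least as high as) level b."""
    return LEVELS[a] >= LEVELS[b]


def view(book, user_level):
    """Return (chapters known to exist, chapters readable) for a user."""
    known = sorted(n for n, c in book.items()
                   if dominates(user_level, c["existence"]))
    readable = sorted(n for n, c in book.items()
                      if dominates(user_level, c["association"]))
    return known, readable
```

Calling view(BOOK, "U") yields all three chapters as known but only A and C as readable, while a Secret user can read all three.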


⁴Note that our approach to the treatment of classification and generalization is similar to some of the multilevel object-oriented data
models such as the ones proposed in [LUNT89, THUR91a]. Other issues with generalization include single inheritance and multiple
inheritance. We adopt the approach proposed in [SELL93].

⁵Also, it should be noted that a subtype could inherit the same attribute at different levels from different supertypes. Conflict
resolution rules need to be applied here. We assume that the level of the inherited attribute is the least upper bound of the levels
involved.

⁶Note that we mean the existence levels of the object and its components.

⁷Note that the value of the composite attribute of a book object would be the chapters of the book. If the association between this
attribute and its value is classified at level L, then anyone at level L or higher could read the chapters of the book.

Figure 3. Aggregation Construct

2.1.5  MEMBERSHIP  CONSTRUCT 

The membership construct is used to model collection objects. A collection object is an object which consists
of members which are objects. Each collection object would belong to an object-type. For example, the
object-type (TEACHER, STUDENT) is a collection object-type whose members are the (teacher, student) pairs. That is,
TEACHER and STUDENT are member types of the type (TEACHER, STUDENT). If John is the teacher of
Mary, then (John, Mary) would be an instance of the collection object-type (TEACHER, STUDENT).

The level of a collection object-type must dominate the levels of the object-types which
form the collection. Otherwise, one could infer the more sensitive member object-types. The same rule applies to
collection objects also. That is, the level of a collection object instance must dominate the levels of the instances
which form the collection. That is, if the collection object (John, Mary) is Secret, then the fact that John is a
teacher and Mary is a student must be at most Secret. In some cases, one could classify the fact that (John, x) is
Unclassified, but (John, Mary) is Secret. This means that John teaches someone, but the object whom he teaches
is not known at the Unclassified level. Suppose one were to have a constraint that the students of John should
not be released at the Unclassified level; if the collection object (John, Mary) is then Unclassified, there is a
security violation via inference.
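
A minimal sketch of the dominance check for collection objects follows, again assuming totally ordered levels. The function name membership_violations and the dictionary of member levels are hypothetical and serve only to illustrate the rule.

```python
LEVELS = {"U": 0, "C": 1, "S": 2, "TS": 3}


def membership_violations(collection_level, member_levels):
    """Return the members whose level is NOT dominated by the collection.

    The rule in the text requires the level of a collection object to
    dominate the levels of the instances that form it, so an empty
    result means the rule holds.
    """
    return sorted(name for name, lvl in member_levels.items()
                  if LEVELS[collection_level] < LEVELS[lvl])
```

For instance, a Secret collection (John, Mary) with members at U and S produces no violations, whereas an Unclassified collection containing a Secret member would be flagged.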

2.1.6 GENERAL CONSTRAINTS

General constraints are constraints enforced on the objects, their properties, and object-types as well as
constraints enforced on the relationships between such constructs. Examples include "the number of courses
taken by a senior student should not be less than 2" or "if a student's GPA is less than 2.0, then he cannot take
course 333."


Enforcing such constraints across security levels is of concern. For example, the grade of a student for
certain subjects could be Secret while for other subjects it could be Unclassified. So, if a constraint such as "if a
student's GPA is less than 2.0, then he cannot take course 333" is enforced at the Unclassified level, and if some
of the grades are Secret, then if the student's GPA is computed over all of his classes, there is a possibility
for someone at the Unclassified level to infer the grades which are Secret. If only the grades at the Unclassified
level are taken into consideration, then there would not be an inference problem, but integrity could be violated.
For example, the GPA of the student when considering only the Unclassified grades could be 3.0 while the
actual GPA of the student could be 1.8.
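
The trade-off can be made concrete with a small worked sketch. The grade values below are invented purely so that the Unclassified-only GPA is 3.0 and the actual GPA is 1.8, as in the example above; the function names are hypothetical.

```python
LEVELS = {"U": 0, "C": 1, "S": 2, "TS": 3}

# (grade, level) pairs: two Unclassified grades and three Secret grades.
GRADES = [(3.0, "U"), (3.0, "U"), (1.0, "S"), (1.0, "S"), (1.0, "S")]


def gpa(grades, user_level):
    """GPA computed only over grades visible at user_level."""
    visible = [g for g, lvl in grades if LEVELS[lvl] <= LEVELS[user_level]]
    return sum(visible) / len(visible)


def may_take_course_333(grades, user_level):
    """The constraint 'GPA below 2.0 means no course 333', per level."""
    return gpa(grades, user_level) >= 2.0
```

Evaluated at the Unclassified level the constraint admits the student (GPA 3.0), while evaluated over all grades it would not (GPA 1.8). Enforcing the constraint only on Unclassified data therefore avoids the inference channel at the cost of integrity.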

There is a trade-off between preventing the inference problem and maintaining integrity when general
constraints are enforced across security levels. In such a situation, the constraints must be considered on a case
by case basis. In the case of the GPA example, the consequence of violating integrity may not be catastrophic.
So, the inference problem could be given higher priority. In the case of an aircraft example, where the maximum
load carried cannot exceed a certain weight, and if the cargo items are classified at different levels, then violating integrity
could be a serious problem if the aircraft is to be able to function. A good discussion of these security problems
is found in Marks et al. [MARK94].

2.1.7 HEURISTICS


An example of a heuristic constraint is the inference rule that if a student is in his senior year, then he must
have taken at least 20 courses. Another example is that if a lecturer teaches more than 2 courses a semester, then
he has no external funding to do research. That is, heuristics are used to express common knowledge. In
formulating heuristic constructs one has to be careful that security violations via inference do not occur. For
example, if the fact that a student has taken at least 20 courses is to be kept Secret, then one cannot assert at the
Unclassified level that a student is in his senior year. Otherwise, via the heuristic constraint listed above, one
could infer at the Unclassified level that the student must have taken at least 20 courses.

Heuristic constraints are in general expressed in the form of IF-THEN statements. They could be complex
logical formulas. An example of a more complex constraint is "if a student is in his senior year or a student is in a
gifted program, then he must have taken at least 20 courses or his GPA must be at least 3.5." During the inference
analysis process, the heuristic constraints as well as the security constraints are examined and if there could be
potential inference problems, then the constraints are modified accordingly.
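
The kind of check described above can be sketched with simple forward chaining over IF-THEN rules. This is an illustrative sketch only: it assumes propositional facts, rules that are themselves Unclassified common knowledge, and totally ordered levels; the function names and fact strings are hypothetical.

```python
LEVELS = {"U": 0, "C": 1, "S": 2, "TS": 3}


def derivable(facts, rules):
    """Close a set of facts under IF-THEN rules (forward chaining).

    Each rule is a pair (premises, conclusion), read as
    'IF all premises hold THEN the conclusion holds'.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return known


def inference_violations(fact_levels, rules, user_level="U"):
    """Facts classified above user_level that the user can still derive."""
    visible = {f for f, lvl in fact_levels.items()
               if LEVELS[lvl] <= LEVELS[user_level]}
    inferred = derivable(visible, rules)
    return sorted(f for f in inferred
                  if f in fact_levels
                  and LEVELS[fact_levels[f]] > LEVELS[user_level])


RULES = [(["senior"], "taken_20_courses")]
FACTS = {"senior": "U", "taken_20_courses": "S"}
```

With these inputs, inference_violations(FACTS, RULES) reports "taken_20_courses", which is the violation discussed in the text. Raising "senior" to Secret removes it.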

2.1.8  TEMPORAL  RELATIONSHIPS 

An example of a temporal constraint is that before a student starts his thesis he must finish his qualifying
exams and his oral exams. Some of the discussion under heuristics also applies to such constraints. For example,
if the fact that the student has either finished his qualifying exams or passed his oral exams is to be kept
Secret, then the fact that the student has started his thesis cannot be Unclassified. If not, one could infer the
Secret information at the Unclassified level.

Inference  analysis  could  also  be  carried  out  on  the  various  activities.  For  example,  a  student  studying  for  his 
qualifying  exams  and  a  student  preparing  for  his  oral  exams  are  both  activities.  Activities  could  themselves  be 
assigned  security  levels.  Now,  one  of  these  two  activities  could  be  carried  out  at  the  Unclassified  level.  But  one 
could  enforce  the  constraint  where  a  student  carrying  out  both  activities  should  be  kept  Secret  until  the  student 
starts  his  thesis.  In  this  case,  one  of  the  two  activities  should  be  Secret  until  the  student  starts  his  thesis.  So,  if 
both  activities  are  assigned  the  Unclassified  level,  and  the  student  has  not  started  on  his  thesis,  then  during 
inference  analysis  such  a  problem  should  be  detected. 

2.1.9  SECURITY  CONSTRAINTS 

As stated earlier, security constraints are a special type of construct. They are used in the inference analysis
process during the modeling of the application. That is, every construct is affected by security constraints. The
various types of constraints that may be enforced are:

(i) Constraints that classify an object-type, an object, property, or value,

(ii) Constraints that classify the value of an object's property depending on the value of some property,

(iii) Constraints that classify any object-type, object, property, or value depending on the occurrence of
some real-world event,

(iv) Constraints that classify associations between collections of attributes and their values,⁸

(v) Constraints which classify activities.

2.2  MULTILEVEL  KNOWLEDGE  DATA  LANGUAGE 

Associated with MKDM described in the previous section is an informal specification language that we have
developed called Multilevel Knowledge Data Language (MKDL). Our objective was to develop a language which
can specify all of the constructs of MKDM as well as translate them into other specification languages such as
logic and SQL statements without much difficulty. The language consists of only one construct, the object-type
specification. That is, each object-type is specified by an object-type specification where all of the constructs
associated with the object-type, such as attributes (properties), subtypes, supertypes, constraints, etc., are
specified.


⁸For example, the association between (name, value) and (GPA, value) pairs is assigned a level.

OBJECT-TYPE object-type-name HAS

    [INSTANCES
        {instance 1, ...}]

    [ATTRIBUTES:
        {attribute-name 1: type,
         attribute-name 2: type, ...}]

    [SUBTYPES
        {object-type-name, ...}]

    [SUPERTYPES
        {object-type-name, ...}]

    [AGGREGATES
        {(component 1, component 2, ...)}]

    [MEMBERS
        {member-type 1, member-type 2, ...}]

    [GENERAL CONSTRAINTS
        {logical formula, ...}]

    [HEURISTICS
        {logical formula, ...}]

    [SECURITY CONSTRAINTS
        {logical formula, ...}]

    [SUCCESSORS
        {object-type-name, ...}]

    [PREDECESSORS
        {object-type-name, ...}]

    [CONCURRENT
        {object-type-name, ...}]

END-OBJECT-TYPE


Figure  4.  Object-Type  Specification 

The format of an object-type specification is illustrated in figure 4. The object-type specification has object
instances, attributes, subtypes, supertypes, aggregates, members, general constraints, heuristics, temporal
relationships specified by successors, predecessors, and concurrents, and security constraints. Not all of the
object-type specifications have entries for all components of the specification. For example, the object type
STUDENT may not have entries for successors, predecessors, or concurrents. An object type TAKES-COURSE
could have entries for successors to indicate the courses to be taken after the completion of the current courses,
predecessors to indicate the courses that should have been taken before taking the current courses, and
concurrents, which specify the actions, such as working on a project, that the student could take in conjunction with
the current courses.⁹

⁹Note that the aggregate construct specifies the component types of an object if it is a composite object. The member construct specifies
the types of the members if the object type has members. Also, constraints are part of the specification of the object type.

Figure 5 illustrates the object-type specification of the particular object-type STUDENT. It has three instances
s1, s2, and s3. The attributes are name, advisor, enrollment, and GPA. The subtype is GRADUATE-STUDENT
while the supertype is PERSON. The general constraint enforced is "maximum number of courses in
an enrollment is four." The heuristic constraint is "if the advisor of the student is the Dean, then the student must
have a GPA of 3.8 or higher." The security constraint classifies the association between the GPA attribute and
its value at the Secret level. The reason for such a security constraint is to protect the association between the
GPA attribute value and the name attribute value. As can be seen, STUDENT has no specification for member,
temporal, and aggregate constructs.


OBJECT-TYPE STUDENT HAS

    [INSTANCES
        {s1, s2, s3, ...}]

    [ATTRIBUTES:
        {Name: String,
         Advisor: FACULTY,
         Enrollment: SET-OF-COURSE,
         GPA: Real,
         ...}]

    [SUBTYPES
        {GRADUATE-STUDENT, ...}]

    [SUPERTYPES
        {PERSON, ...}]

    [GENERAL CONSTRAINTS
        {Maximum number of courses in
         enrollment is four, ...}]

    [HEURISTICS
        {If the advisor of the student is the
         Dean, then the student must have a GPA of
         3.8 or higher, ...}]

    [CLASSIFICATION CONSTRAINTS
        {student's GPA is Secret,
         ...}]

END-OBJECT-TYPE STUDENT


Figure  5.  Object-Type  Specification  for  STUDENT 

One of the drawbacks with the specification language that we described earlier in this section is that all of the
security constraints are stated under the construct "SECURITY CONSTRAINTS." In fact, some of these
constraints could be attached to the other constructs. For example, a particular attribute value could be Secret and
this information could be attached to the attribute construct. That is, in addition to the type of the attribute, its
security level (i.e. its existence level) and the level of the association between the attribute and its value are also specified. Such
an alternate representation language for OBJECT-TYPE as well as the specification for a STUDENT object-type are
illustrated in figures 6 and 7, respectively.

As can be seen, associated with object-type, instance, attributes, subtypes, supertypes, etc. there is a security
level. Attached to the level, one could also have an explanation as to why such a level is assigned. In the
specification for STUDENT, there is no entry for SECURITY CONSTRAINTS as levels are attached to the
various constructs. Note that if the constraints are complex logical formulas which consist of many subclauses,
they could be specified under SECURITY CONSTRAINTS.


OBJECT-TYPE object-type-name HAS
(Level)

    [INSTANCES
        {instance 1 (Level), ...}]

    [ATTRIBUTES:
        {(attribute-name 1: type, (Level), (value-Level)),
         (attribute-name 2: type, (Level), (value-Level)), ...}]

    [SUBTYPES
        {object-type-name (Level), ...}]

    [SUPERTYPES
        {object-type-name (Level), ...}]

    [AGGREGATES
        {(component 1 (Level), component 2 (Level), ...)}]

    [MEMBERS
        {(member-name (Level), member-type (Level), ...)}]

    [GENERAL CONSTRAINTS
        {logical formula (Level), ...}]

    [HEURISTICS
        {logical formula (Level), ...}]

    [SECURITY CONSTRAINTS
        {logical formula (Level), ...}]

    [SUCCESSORS
        {object-type-name (Level), ...}]

    [PREDECESSORS
        {object-type-name (Level), ...}]

    [CONCURRENT
        {object-type-name (Level), ...}]

END-OBJECT-TYPE

Figure  6.  Alternate  Object-Type  Specification 

OBJECT-TYPE STUDENT (U) HAS

    [INSTANCES
        {s1 (U), s2 (U), s3 (U), ...}]

    [ATTRIBUTES:
        {(Name: String, (U), (U)),
         (Advisor: FACULTY, (U), (U)),
         (Enrollment: SET-OF-COURSE, (U), (U)),
         (GPA: Real, (U), (S) (protect association
          between GPA and name)), ...}]

    [SUBTYPES
        {GRADUATE-STUDENT (U), ...}]

    [SUPERTYPES
        {PERSON (U), ...}]

    [GENERAL CONSTRAINTS
        {Maximum number of courses in
         enrollment is four (U), ...}]

    [HEURISTICS
        {If the advisor of the student is the
         Dean, then the student must have a GPA of 3.8
         or higher (U), ...}]

END-OBJECT-TYPE STUDENT


Figure  7.  Alternate  Object-Type  Specification  for  STUDENT 
2.3  GRAPHICAL  REPRESENTATION 

Since graphical representations are easier for humans to understand than written specifications, the
corresponding graphical representation for the STUDENT object-type specification is illustrated in figure 8. It is
basically an extended entity relationship diagram. Each rectangle represents an entity such as PERSON or
STUDENT. Each arrow represents an attribute and points to the entity which is the type of the value of the
attribute. The security constraint, the general constraint, and the heuristic are specified on the attributes. It should be
noted that such a representation may not capture complex general constraints, security constraints, heuristics,
members, and temporal relationships. Nevertheless, it can capture the entities, the relationships between them,
and a reasonable set of constraints. We call the graphical representation based on MKDM GRAPHICAL-MKDM.

Note that reasoning with the graphical representation scheme is less straightforward than reasoning with a
specification language. Therefore, in the modeling process, the first step is usually to represent the entities of the
application using a graphical representation scheme. The graphical representation is then translated into some
specification language. That is, with MKDM, the first step is to use GRAPHICAL-MKDM to represent the
application. Then the representation is transformed into an MKDL specification.


Figure 8. Graphical Representation


3  KNOWLEDGE  TRANSFORMATION 
3.1  OVERVIEW 

As stated in section 1, different representation schemes and inference analysis tools have been developed.
Each handles a subset of the inference types and is specialized to particular types of inference problems. That is,
the tools are heterogeneous in nature. While standardization has been proposed to handle heterogeneity for
operating systems and database systems, it has also been realized that vendors, eager to maintain their advantages
over competitors and to preserve their share of the market, are not going to abandon their products and develop a
single product such as an operating system or a database management system. That is, heterogeneity is here to
stay. In the same way, one can expect to see more and more heterogeneous inference analysis tools.

Now, one way to handle heterogeneity is to develop a uniform model of the system to which each vendor can
interface his product. That is, in the case of heterogeneous database systems, a global view of the environment
can be provided. Consequently, transformations are needed from each local system to the global view of the
environment. In the same way, the approach that we have proposed in this paper is intended to give a global view
for modeling the multilevel database application and consequently conducting inference analysis. Therefore,
transformations are needed from the MKDM methodology proposed here to the various heterogeneous
representation schemes such as conceptual structures, logic programming specifications, and SQL specifications
proposed by others. This way the transformations can be applied to obtain the individual representation schemes,
which means that the inference analysis tools developed for the individual schemes can then be applied. That is,
one can take advantage of the complementary tools that have already been developed.

We discuss with a simple example the usefulness of the transformations. Suppose we are given a multilevel
database application, a tool developed by Hinke (which is applied to a special conceptual structure-based
representation called conceptual graphs), and a tool developed by Binns (which is applied to SQL specifications).
The application designer has difficulty learning about conceptual graphs and SQL, but he is familiar with the
MKDM methodology. So, he would first represent the application in MKDM-related specifications, apply any
inference analysis tools developed for MKDM, and then use the transformations to generate a conceptual
graph-based representation and SQL specifications. Then he would apply Hinke's and Binns' tools for inference
analysis. Each tool would uncover a different set of problems and the result would be a more secure design of the
application.

This section describes the transformations from the scheme proposed here to conceptual structures, logic
programming specifications, and extended SQL specifications. These transformations are described in sections
3.2, 3.3, and 3.4.

3.2  CONCEPTUAL  STRUCTURES 

This section describes how the specifications in GRAPHICAL-MKDM can be transformed into a conceptual
structure-based representation. Note that we obtained the graphical representation described in figure 8 from
MKDL specifications for the STUDENT object-type. Therefore, one could also transform MKDM specifications
into a conceptual structure-based representation.

The particular conceptual structures that we will examine are semantic nets discussed in [THUR90]. We
consider a semantic net to be a collection of nodes connected via links. The nodes represent concepts, entities,
etc., and the links represent relationships between them. Our treatment of semantic nets is influenced by the work
reported in [RICH89]. The entities in GRAPHICAL-MKDM will transform into nodes and the arrows will
transform into links between the nodes. Therefore, much of the information in GRAPHICAL-MKDM (except the
constants) will be represented in a similar manner using semantic nets. The constraints, such as heuristic
constraints, transform into what we have called constraint nets in
[THUR90]. For example, the constraint "if the advisor is Dean, then GPA must be 3.8 or higher" is specified
using the constraint net illustrated in figure 9.


Figure  9.  Constraint  Net 


3.3  LOGIC  PROGRAMMING 


As stated earlier, while conceptual structures and graphical representations help humans to capture the
essential points of the applications more easily, reasoning with graphical representations could become quite
complex. Therefore, specification languages have been developed to specify an application so that analysis tools
could be applied on the specification. One of the popular specification and reasoning languages that has been
proposed is based on logic. The advantage of using logic is that it can serve as a specification language or a
programming language. Using logic as a programming language enables the programmer to only specify the
application in logic. The programming system will conduct the reasoning and provide the results. This way, the
programmer need not be burdened with the details of the procedures.

We have proposed logic programming systems for inference analysis (see for example [THUR89]). The idea
is to specify the application as a logic program so that the control component of the program can reason and
detect certain inference problems that result due to logical deduction. To use tools based on logic programming,
MKDL has to be transformed into a logic-based specification. The specification for STUDENT described in
figure 5 will transform into the logic program shown in figure 10. From the first and last clauses shown in this
figure, one can deduce that the student s1's GPA is Secret. The program uses backward chaining to make this
deduction.


STUDENT(s1) <-
STUDENT(s2) <-
STUDENT(s3) <-

ISA(GRADUATE-STUDENT, STUDENT) <-
ISA(STUDENT, PERSON) <-
ATTRIBUTE(STUDENT, Name, String) <-
ATTRIBUTE(STUDENT, Advisor, FACULTY) <-
ATTRIBUTE(STUDENT, Enrollment, SET-OF-COURSE) <-
ATTRIBUTE(STUDENT, GPA, Real) <-

N ≤ 4 <- NUMBER(STUDENT, Enrollment, N)

GPA(S) ≥ 3.8 <- STUDENT(S) and ADVISOR(S, Dean)
Level(GPA(S)) = Secret <- STUDENT(S)


Figure  10.  Logic  Programming  Specification 
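
The backward-chaining deduction over the first and last clauses of figure 10 can be sketched as follows. This is a deliberately small illustration and not the logic programming system of [THUR89]: it assumes ground goals, flat atoms encoded as tuples of strings, variables marked with a leading "?", and non-recursive rules. All function names are hypothetical.

```python
def unify(pattern, ground):
    """Match a rule head against a ground goal; return bindings or None."""
    if len(pattern) != len(ground):
        return None
    binding = {}
    for p, g in zip(pattern, ground):
        if p.startswith("?"):
            # A variable must bind consistently everywhere it occurs.
            if binding.setdefault(p, g) != g:
                return None
        elif p != g:
            return None
    return binding


def substitute(atom, binding):
    """Replace variables in an atom by their bound values."""
    return tuple(binding.get(term, term) for term in atom)


def prove(goal, facts, rules):
    """Backward chaining: reduce the goal to facts through rule bodies."""
    if goal in facts:
        return True
    for head, body in rules:
        binding = unify(head, goal)
        if binding is not None and all(
                prove(substitute(atom, binding), facts, rules)
                for atom in body):
            return True
    return False


FACTS = {("STUDENT", "s1"), ("STUDENT", "s2"), ("STUDENT", "s3")}
# Level(GPA(S)) = Secret <- STUDENT(S), encoded as a flat atom.
RULES = [(("LEVEL", "GPA", "?S", "Secret"), [("STUDENT", "?S")])]
```

Here prove(("LEVEL", "GPA", "s1", "Secret"), FACTS, RULES) succeeds by reducing the goal to the fact STUDENT(s1), while the same goal for an unknown student fails.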

3.4 EXTENDED SQL SPECIFICATION

Transforming the object-type specifications into a language such as SQL is highly desirable and in many
cases even necessary. This is because many of the MLS/DBMSs that exist today are based on the relational data
model with facilities for specifying the schemas in SQL. Furthermore, SQL is also an ANSI standard language.
Since the object-type specification has complex constructs, it is not possible to express all of them in standard
SQL. That is, extensions to SQL are necessary to specify the constructs. In this section we discuss the generation
of extended SQL statements from the object-type specifications.¹¹

Each object-type will translate into a table (or relation) specification in extended SQL. The attributes of an
object-type will translate into attributes of a relation. Since subtypes and supertypes are not part of the relational
model, the relations which correspond to the subtypes and supertypes are inserted into the schema specification
for subtypes and supertypes. That is, SQL has to be extended to include subtype and supertype relationships. It
should be noted that such extensions have been proposed for SQL to support object-oriented constructs. This
version of SQL is called Object SQL.¹² Extensions to SQL are necessary to specify constructs such as general
constraints, heuristics, security constraints, and the temporal relationships. One option is to specify these as part
of the specification of the table as shown in figure 11. But this would mean more changes to SQL, as integrity
constraints are specified separately in SQL and are not part of the table declaration. If we want to be consistent
with the SQL standard, then the constraints have to be specified separately. Figure 12 illustrates the extended
SQL specification for object-type STUDENT.


¹¹The discussion provided here on extended SQL is preliminary. One of the objectives is to minimize the changes to the SQL
standard.

¹²One way to implement the subtype relationship in SQL is to have a table for the supertype and a table for the subtype. The
primary key of the table for the supertype will also be the primary key of the table for the subtype. Some issues have been discussed in
[SELL93].
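
The shared-primary-key scheme described in the footnote above can be illustrated with standard SQL, run here through Python's built-in sqlite3 module. This sketch uses plain (not extended) SQL and omits security levels entirely; the table and column names are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Supertype table PERSON and subtype table STUDENT share a primary key:
# student.id is both the primary key of STUDENT and a reference to PERSON.
conn.executescript("""
CREATE TABLE person (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE student (
    id  INTEGER PRIMARY KEY REFERENCES person(id),
    gpa REAL
);
""")

conn.execute("INSERT INTO person VALUES (1, 'Jane')")
conn.execute("INSERT INTO student VALUES (1, 3.8)")

# Inherited attributes of a STUDENT are recovered by joining on the key.
rows = conn.execute(
    "SELECT p.name, s.gpa FROM student s JOIN person p ON p.id = s.id"
).fetchall()
```

The join returns the inherited name together with the subtype's own GPA attribute, which is the relational counterpart of a STUDENT inheriting the properties of PERSON.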

CREATE TABLE table-name

    [ATTRIBUTES:
        {attribute-name 1: type,
         attribute-name 2: type,
         ...}]

    [SUBTYPES
        {table-name 1, table-name 2, ...}]

    [SUPERTYPES
        {table-name 1, table-name 2, ...}]

    [AGGREGATE
        {table-name 1, table-name 2, ...}]

    [MEMBERS
        {table-name 1, table-name 2, ...}]

    [GENERAL CONSTRAINTS
        {logical formula, ...}]

    [HEURISTICS
        {logical formula, ...}]

    [CLASSIFICATION CONSTRAINTS
        {logical formula, ...}]

    [SUCCESSORS
        {table-name, ...}]

    [PREDECESSORS
        {table-name, ...}]

    [CONCURRENT
        {table-name, ...}]

    [TUPLES
        {tuple 1, ...}]

END CREATE TABLE

Figure  1 1 .  Extended  SQL  Specification 

As can be seen, the object-types correspond to table names. The instances and attributes of object-types
translate into tuples and attributes of tables. SUBTYPES and SUPERTYPES are extensions to SQL that have to
be made to support the associated constructs in MKDL. Because these constructs are object-types, they can be
specified by table names. The member construct is specified by a collection of table names where each table name
in the collection corresponds to a member type. A tuple of the member table would consist of elements where
each element is the primary key of a table which corresponds to a member type. Similarly, the aggregation
construct can also be specified by a list of table names where each table in the list corresponds to the type of a
component object.^13 General constraints, heuristics, and security constraints are expressed as formulas and, as
stated earlier, they could be specified outside of the table declaration or within a table declaration as shown in
figure 12. The temporal construct is represented by table names. Here we assume that information about an
activity can be represented by a table. Whether such a scheme is sufficient to represent all of the temporal
constructs is yet to be determined.^14

^13 More research needs to be done on specifying and implementing aggregate objects in the relational model.


CREATE TABLE STUDENT

[ATTRIBUTES:
    (Name: String,
     Advisor: FACULTY,
     Enrollment: SET-OF-COURSE,
     GPA: Real,
     ...)]

[SUBTYPES
    (GRADUATE-STUDENT, ...)]

[SUPERTYPES
    (PERSON, ...)]

[GENERAL CONSTRAINTS
    (Maximum number of courses in
     enrollment is four, ...)]

[HEURISTICS
    (If the advisor of the student is the
     Dean, then the student must have a GPA of 3.8 or
     higher, ...)]

[CLASSIFICATION CONSTRAINTS
    (Student's GPA is Secret, protect the association
     between the attributes GPA and name)]

[TUPLES
    (s1, s2, s3, ...)]

END CREATE TABLE STUDENT

Figure 12. Extended SQL Specification for STUDENT


4. INFERENCE TYPES AND INFERENCE ANALYSIS

GRAPHICAL-MKDM and MKDL capture sufficient semantics of the application so that inference tools can be
developed for them. Furthermore, they are sufficiently general that they can be transformed into
existing representation schemes such as conceptual structures, logic programming, and extended SQL. This
way, existing inference tools could be applied to the different specifications. Note that some aspects of inference
analysis were given in section 2 when we described hypersemantic data modeling. This section describes
inference analysis in more detail. Before one applies inference analysis tools, one needs to determine what types
of inference are to be handled. In section 4.1 we will describe some of the inference types that we are interested
in. In section 4.2 we discuss inference analysis.

^14 Note that instead of specifying all of the security constraints under the SECURITY CONSTRAINTS construct, as
illustrated in figures 11 and 12, one could attach levels with the tables and the attributes whenever possible. For
example, if the attribute GPA is Secret, then next to this attribute, a label S could be attached. That is, extended
SQL specifications which correspond to figures 6 and 7 can also be given.

4.1 INFERENCE TYPES


In  section  4.1.1  we  discuss  some  inference  types  that  are  more  special  to  (but  not  necessarily  limited  to)  MKDM 
and in section 4.1.2 we discuss some of the more general inference types.

4.1.1 INFERENCE TYPES FOR MKDM^15

The  Multilevel  Knowledge  Data  Model  provides  a  natural  means  to  control  some  inference  in  databases. 
The inference problem occurs when a Low cleared user, retrieving Low classified data, is able to infer High
classified data. Such inference capabilities are not part of the database mechanism, but instead depend upon the
semantic and logical relationship of the data. For the inference problem to occur, data is accumulated into a
meaningful concept, and knowledge about the instance of that concept is then applied to derive additional
attributes. For example, the object (name, salary) might be classified High, but the two objects (name, job) and
(job, salary) might be sufficient to infer the restricted object (name, salary). This will be true if there are two
knowledge rules, “each person has exactly one job” and “each job has exactly one salary”, which hold for all
individual instances of the concept.
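As an illustration only, the (name, salary) example can be sketched in a few lines of Python. The relation contents and the helper name infer_name_salary are hypothetical and not part of MKDM; the sketch simply joins the two Low objects on job, which yields the restricted association exactly when the two knowledge rules hold.

```python
# Illustrative sketch: two Low objects jointly reveal a High association.
# All data and names below are hypothetical.

name_job = {"Alice": "Engineer", "Bob": "Clerk"}   # each person has exactly one job
job_salary = {"Engineer": 90000, "Clerk": 40000}   # each job has exactly one salary


def infer_name_salary(name_job, job_salary):
    """Join (name, job) with (job, salary) on job to obtain (name, salary)."""
    return {
        name: job_salary[job]
        for name, job in name_job.items()
        if job in job_salary
    }


inferred = infer_name_salary(name_job, job_salary)
print(inferred)
```

If either rule fails (for instance, a job with several possible salaries), the join no longer determines a single salary per name, and the inference becomes only partial.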

The simple classification constraints might be sufficient to address the inference problem if all instances in
the database belonged to a single object-type, but problems arise when an object inherits from multiple
object-types. In many cases, real objects belong to multiple object-types. For example, the object-types
Hospital_patient and Hospital_employee may have some identical instances, those employees who are also
patients. The object-types, however, could be maintained by independent departments (admitting and personnel)
with patient data being Confidential and employee data being Unclassified. Those instances belonging to both
would have to be classified at the highest of these levels, i.e. Confidential. Raising the classification of those
instances means that the Unclassified users of Hospital_employee can no longer access that information. One
solution to this problem is to keep the object-types separate, without shared instances. In this case, those
employee-patients will have two objects with some information (such as name, social security number) being
duplicated.

However, as the (name, salary) example illustrates, duplicating information may increase inference threats.
Inference is frequently accomplished by retrieving objects from different object-types and using smaller “pieces”
of these objects to form the restricted object. That is, name is retrieved from one object, and salary from
another. At no time does the user retrieve an instance from the classified object-type, yet such an instance results
from combining unclassified objects. We need a way to verify that a classified object-type has been created. To
solve this, we adopt the following definition:

An object is an instance of an object-type if the object has those attributes specified by the object-type.

(This definition has been paraphrased in the real world as “if it walks like a duck, and it quacks like a duck, then
it is a duck”!)

We can now define an object-type in terms of its instances. We will call such things virtual objects. A
virtual object is any collection of instances having a common set of attributes defined in the database. Virtual
objects will be the means by which we can keep track of a user's access to parts of restricted objects. We now
discuss some of the inference types for MKDM.
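The attribute-based definition of membership lends itself to a small sketch. The following Python fragment is an assumption-laden illustration, not part of MKDM: objects are modeled as dictionaries, and the function names is_instance_of and virtual_object are hypothetical.

```python
# Sketch of the attribute-based membership test; all names are hypothetical.

def is_instance_of(obj, object_type_attrs):
    """An object is an instance of a type if it has every attribute of the type."""
    return set(object_type_attrs) <= set(obj)


def virtual_object(instances, attrs):
    """Collect the instances that share `attrs`, restricted to those attributes."""
    return [
        {a: inst[a] for a in attrs}
        for inst in instances
        if is_instance_of(inst, attrs)
    ]


database = [
    {"name": "Alice", "ssn": "111", "salary": 90000},
    {"name": "Bob", "ssn": "222"},
    {"title": "Contract X"},
]

print(virtual_object(database, ["name", "ssn"]))
```

A monitor could compute such virtual objects over everything a user has retrieved so far and compare them against the classified object-types.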

CASE  1.  Sub-types  Accumulate  to  Release  Object-type 

If we have an aggregation construct (IS-PART-OF), inheritance issues do not apply. Release of the object
does not necessarily release the “parts”; however, we must assume that release of a “part” will release some
information about the object. Such information may or may not be sufficient to compromise the object. Release
of all parts is assumed to release all of the object. If the object has a higher classification than the part objects,
then the object must be assigned a threshold, and only a limited number of parts may be released to a lower
cleared process. Note that a similar argument can be applied for the relationship between an object-type and its
part (or component) object-types.
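One possible way to enforce such a threshold is sketched below. The class PartReleaseMonitor and its interface are hypothetical; the sketch assumes that the threshold is a simple count of distinct parts per process, which is only one reading of the rule above.

```python
# Sketch of a per-process threshold on released parts (hypothetical design).

class PartReleaseMonitor:
    """Tracks which parts of a higher-classified object a lower-cleared
    process has seen, and refuses releases beyond a threshold."""

    def __init__(self, parts, threshold):
        self.parts = set(parts)
        self.threshold = threshold   # maximum number of distinct parts releasable
        self.released = {}           # process id -> set of parts already released

    def request(self, process, part):
        if part not in self.parts:
            raise KeyError(part)
        seen = self.released.setdefault(process, set())
        if part in seen:
            return True              # re-reading a released part reveals nothing new
        if len(seen) >= self.threshold:
            return False             # a further part could compromise the object
        seen.add(part)
        return True


monitor = PartReleaseMonitor(["wing", "engine", "radar"], threshold=2)
print(monitor.request("p1", "wing"), monitor.request("p1", "engine"),
      monitor.request("p1", "radar"))
```

The sketch keeps state per process; it does not address collusion between processes, which would require tracking releases per user or per clearance level.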


^15 Our discussion of inference types for MKDM is preliminary. We feel that the inference problem for the
object-oriented model warrants a detailed investigation.

The next two inference types are related and deal with relationships among real and virtual object-types.
Rather than try to account for all possible virtual object-types, we will show those virtual object-types derivable
from individual object-types and then show how these may be composed into additional virtual object-types. The
structure of an object-type immediately leads to a virtual object structure for sub-types. For example, let A be a
generalization construct consisting of sub-types A1, A2, ..., An. Each Ai isa A. Since Ai isa A, each instance of Ai
inherits all the attributes of A, plus possibly some additional ones. Instances in Ai restricted to the attributes
inherited from A therefore define a virtual object-type, denoted Ai | A. The instances of Ai | A have identical
attributes (those specified by A) and belong to the generalization class denoted by A. We can say that Ai | A isa
A. If Ai itself has subtypes, these will inherit all of the Ai attributes as well as all of A's attributes, and hence may
be used to define two virtual object-types. These particular virtual object-types follow the transitivity relation as
do regular sub-types.


Transitivity. 

If A isa B and B isa C, then A isa C.

If A | B isa B and B | C isa C, then A | C isa C.
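The transitivity rule for isa can be made concrete with a short sketch. The function name isa_closure and the pair-based representation are assumptions made for illustration; the same closure applies to the virtual object-types, since they obey the same rule.

```python
# Sketch: transitive closure of isa relationships (hypothetical representation).

def isa_closure(isa_pairs):
    """Return the transitive closure of a set of (subtype, supertype) pairs."""
    closure = set(isa_pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))   # A isa B and B isa D give A isa D
                    changed = True
    return closure


pairs = {("GRADUATE-STUDENT", "STUDENT"), ("STUDENT", "PERSON")}
print(sorted(isa_closure(pairs)))
```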


CASE 2. Object-type Releases Virtual Sub-types

Let us consider the generalization construct for object-type PERSON, having sub-types STUDENT and
EMPLOYEE. If all instances of PERSON are known, at least some attributes (those relating to PERSON) of all
instances of both STUDENT and EMPLOYEE are also known. Release of all instances of PERSON therefore
releases all instances in the virtual sub-types STUDENT | PERSON and EMPLOYEE | PERSON. Since
sub-types must be classified at least as high as the parent object-type, this does not directly compromise any
information. However, information on such disclosures must be maintained since they may be combined with
other methods, such as case 3 below.^16 This observation may be summarized as:

Lemma 1: If we retrieve all the instances of object-type A, and B isa A, we will be able to infer all the instances
of the virtual object-type B | A.

CASE  3.  Instance  in  Virtual  Sub-type  Releases  Instance  in  Object-type 

Case 2 applies to cases where we release all instances of some object-type. Suppose, however, that we
release only one instance. Does this lead to an inference problem? Now, each instance of a sub-type is an instance
of the parent object-type. Instances in a sub-type inherit all the attributes of the parent, so this is restated as:

Lemma 2: If we retrieve an instance in object-type A, and A isa B, we will be able to infer an instance in
object-type B.


These two lemmas are utilized in the following theorem:

Inference Theorem: If we retrieve instances of object-type A, we will be able to infer instances of object-type
B if B | A isa B, and B | A ≠ ∅.

Proof: By lemma 1, retrieving an instance of object-type A allows us to infer an instance of object-type
B | A. If B | A isa B, then by lemma 2, we can infer an instance of B for each instance of B | A. The instances
found in B may not be unique, however, and we cannot guarantee that all instances of B are derivable from
instances of A.

Informal Description of the Inference Theorem: If we retrieve instances of object-type A, we will be
able to infer instances of object-type B if the attributes of B are a subset of the attributes of A.

^16 For example, there could be a case where one infers the instances of a classified object-type C from the
instances of the Unclassified object-types C | A and C | B.

Example: Assume that a company has a classified contract, i.e. the people working on contract X may not be
retrieved by uncleared database users. Now assume that the database for this company consists of object-type
COMPANY_EMP, containing the names and SS# of all the employees, and object-types (sub-classes)
ENGINEER and SUPPORT containing the appropriate attributes. Now assume a second, classified object-type
PROJECT_X_EMP, containing only the names of the employees working on project X. Suppose we retrieve
instances of object-type ENGINEER. In the inference theorem, ENGINEER then corresponds to object-type A
and PROJECT_X_EMP corresponds to object-type B. The only attribute common to both objects is name
(attributes of B | A = {name}), which is identical to the attribute set of object-type B, and hence B | A isa B. So
release of unclassified instances of object-type ENGINEER releases names of employees in classified object-type
PROJECT_X_EMP.
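The informal description of the theorem suggests a direct check, sketched here for the ENGINEER and PROJECT_X_EMP example. The attribute lists, the sample data, and the function name infer_instances are assumptions made for illustration. The function tests whether the attributes of B are a subset of those of A and, if so, projects the retrieved instances of A onto B's attributes.

```python
# Sketch of the informal inference theorem; data and names are hypothetical.

def infer_instances(a_instances, a_attrs, b_attrs):
    """If every attribute of B is an attribute of A (so B | A isa B),
    project the retrieved instances of A onto B's attributes."""
    if not set(b_attrs) <= set(a_attrs):
        return []                        # the theorem does not apply
    projected = []
    for inst in a_instances:
        candidate = {attr: inst[attr] for attr in b_attrs}
        if candidate not in projected:   # inferred instances need not be unique
            projected.append(candidate)
    return projected


ENGINEER_ATTRS = ["name", "ssn", "discipline"]
PROJECT_X_EMP_ATTRS = ["name"]

engineers = [
    {"name": "Alice", "ssn": "111", "discipline": "radar"},
    {"name": "Bob", "ssn": "222", "discipline": "software"},
]

candidates = infer_instances(engineers, ENGINEER_ATTRS, PROJECT_X_EMP_ATTRS)
print(candidates)
```

The result has the structure of PROJECT_X_EMP instances. As the proof notes, nothing guarantees that every instance of B is obtained this way, so the output is best read as a set of candidate instances.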

The  difficulty  in  inference  control  comes  in  applying  the  inference  theorem.  Database  models  typically  do 
not  specify  all  possible  generalization  relationships.  It  may  be  possible  to  access  several  objects  and  form  a  chain 
of  inferences  to  compromise  a  classified  object.  In  the  worst  case,  all  possible  combinations  of  all  objects  must 
be  considered  as  possible  inference  paths. 

Much  of  the  current  research  in  inference  control  centers  around  finding  and  specifying  the  relationships 
between objects, especially objects that actually belong to object-types but are not specified in the object-type
definition. Such hidden relationships may be combined to form an inference path. The work of Binns [BINN92]
and Garvey [GARV92] addresses the relational equivalent of finding these types of chains, or paths. The
conceptual structures used by Thuraisingham [THUR90] provide a graphical method of defining these
relationships between objects. Hinke [HINK92] has developed a knowledge engineering tool designed to assist
the database designer in defining relationships between such data concepts. Ideally, these methods would enable
us to specify all the generalization relations between all possible objects, and the inference relationships would
then  be  evident. 

The  use  of  the  object-oriented  paradigm  offers  certain  benefits  over  the  relational  model.  In  particular, 
arbitrary  combinations  of  attributes  need  not  be  considered.  Each  object  has  a  predefined  set  of  attributes,  and 
new  combinations  of  attributes  which  are  not  derivable  from  the  existing  predefined  sets  need  not  be  examined. 
The question now becomes: what new, virtual objects need to be controlled? Of course, we must prevent the
construction  of  any  high  classified  object  from  lower  classified  pieces.  Some  such  objects  will  be  explicitly 
defined  via  the  classification  constraints.  That  is,  the  classification  constraints  may  be  viewed  as  specifying  a 
classified  virtual  object  or  object-type. 

An  additional  consideration  for  controlling  deductive  inference  comes  from  considering  the  constraints. 
General  and  heuristic  constraints  are  a  means  of  specifying  connections  between  objects  that  are  not  derivable 
from the specified hierarchical structure. They represent an additional set of deductive rules that may need to be
handled by use of logic programming, as discussed in section 4.2.2. It is important to realize, however, that
the  techniques  are  not  separate  but  complement  each  other. 

4.1.2  OTHER  INFERENCE  TYPES 

Some  of  the  more  common  inference  types  that  have  been  discussed  in  the  literature  include  inference  by 
logical deduction and semantic association. For example, if A implies B, A is Unclassified, and B is Secret, then
there is an inference problem through logical deduction. As another example, if A and B are Unclassified
individually, but taken together they are Secret, then there is an inference problem through semantic association.
Note that the inference problem through semantic association is in many ways similar to the inference problem
that results from the aggregate hierarchy. A discussion of some of the other inference types is given in
[THUR91b].

4.2  INFERENCE  ANALYSIS 

As  described  earlier,  inference  analysis  tools  can  be  applied  on  GRAPHICAL-MKDM,  MKDL,  or  on 
transformed  specifications  such  as  conceptual  structures,  logic  programming  specifications,  and  extended  SQL. 
We describe some of the essential points in this section. Note that inference analysis has to be a repetitive process
if  the  application  under  consideration  is  a  dynamic  one.  For  example,  new  entities  and  relationships  could  be 
introduced  and  the  security  levels  of  the  entities  could  also  change.  Therefore,  at  every  step,  the  various  analysis 
tools  have  to  be  applied  to  prevent  security  violations  via  inference. 

4.2.1  TOOLS  APPLIED  TO  THE  HYPERSEMANTIC  MODEL-BASED  REPRESENTATION 

This section describes some inference analysis tools that could be applied on the hypersemantic model-based
representation that we have discussed in section 2.

Consider the example of an inference analysis tool which could be used on the specification in MKDL to
detect certain inference problems. The security constraints could be applied to each modeling construct such as
classification, generalization, and temporal relationships. If it is detected that there is a security violation via
inference as discussed in section 2, then the designer is notified and the security levels assigned to the various
constructs are adjusted. For example, if the subtype is assigned a lower level than the object-type, then the
security rule for the inheritance hierarchy is violated, and the designer is notified of the problem. As another
example, heuristic rules can be used to deduce derived information and the security constraints could be used to
assign the security levels to the derived information. If the level of the derived information is higher than the level
of the information used to derive this information, then there is a potential for security violation via inference.
Such logical inferences could be detected during the inference analysis process.
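The two checks just described can be sketched as follows. The level ordering, the function names, and the comparison of a derived level against the highest source level are assumptions made for this illustration; an actual tool would work from the MKDL specification rather than from dictionaries.

```python
# Sketch of two design-time checks (hypothetical names, totally ordered levels assumed).

LEVELS = {"Unclassified": 0, "Confidential": 1, "Secret": 2}


def check_inheritance(levels, isa_pairs):
    """Report (subtype, supertype) pairs where the subtype is classified lower."""
    return [
        (sub, sup)
        for sub, sup in isa_pairs
        if LEVELS[levels[sub]] < LEVELS[levels[sup]]
    ]


def check_derivation(source_levels, derived_level):
    """True if derived information is classified above everything it was derived from."""
    return LEVELS[derived_level] > max(LEVELS[lvl] for lvl in source_levels)


levels = {"PERSON": "Confidential", "STUDENT": "Unclassified", "EMPLOYEE": "Secret"}
violations = check_inheritance(levels, [("STUDENT", "PERSON"), ("EMPLOYEE", "PERSON")])
print(violations)
print(check_derivation(["Unclassified", "Unclassified"], "Secret"))
```

Each reported pair, and each derivation for which the second check returns True, would be brought to the designer's attention so that the levels can be adjusted.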

Tools to handle some of the inference types discussed in section 4.1.1 are quite complex and require further
research. We are conducting some preliminary research toward designing a tool based on a graphical model
which shows how one object may be used to derive another object. Some of the issues toward developing such a
tool were discussed in section 4.1.1. Using such a tool, one could deduce whether a higher level object could be
derived from lower level objects.

4.2.2  APPLYING  INFERENCE  ANALYSIS  TOOLS  ON  TRANSFORMED 
REPRESENTATIONS 

This section describes inference analysis on the transformed representations. Since the transformed
representations are essentially those developed by others, the information in this section is taken from the
inference work published by others. Also, since we have focussed mainly on representations based on
conceptual structures, logic programming, and SQL, we will describe the analysis tools designed for such
representations.


Reasoning with conceptual structures for inference prevention has been described in [THUR90]. Some tools
based on a similar approach have been developed by Garvey et al [GARV92] and Hinke et al [HINK92]. With
conceptual structures one could use a variety of inference rules for deductions. These include the transitive rule,
the distribution rule, and pattern matching. For example, if there is a net which asserts that champion isa ship
and ship has a captain, then one can deduce through the transitive rule that champion has a captain. If one wants
to protect the fact that champion has a captain at High and the information in the net is Low, then there is an
inference problem. Pattern matching is one of the inference rules used in semantic nets to derive new information
from the main net and the constraint nets. For example, consider the constraint net of figure 9. If in the main net
there is an arrow from STUDENT to DEAN, then one can conclude that the student's GPA must be 3.8 or higher.
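The transitive rule on a semantic net can be illustrated with a small sketch. The edge-list representation and the function names deduce_has and inference_problems are assumptions; the sketch is not the reasoning strategy of [THUR90], only a minimal rendering of the champion example.

```python
# Sketch of the transitive rule on a tiny semantic net (hypothetical representation).

def deduce_has(isa_edges, has_edges):
    """Transitive rule: if X isa Y (possibly through a chain) and Y has Z, deduce X has Z."""
    parents = {}
    for child, parent in isa_edges:
        parents.setdefault(child, set()).add(parent)

    def ancestors(node, seen=None):
        seen = set() if seen is None else seen
        for p in parents.get(node, ()):
            if p not in seen:
                seen.add(p)
                ancestors(p, seen)
        return seen

    deduced = set()
    for node in parents:
        for anc in ancestors(node):
            for owner, part in has_edges:
                if owner == anc:
                    deduced.add((node, part))
    return deduced


def inference_problems(deduced, protected_high):
    """Facts deducible from the Low net that are meant to be protected at High."""
    return deduced & set(protected_high)


deduced = deduce_has([("champion", "ship")], [("ship", "captain")])
print(inference_problems(deduced, [("champion", "captain")]))
```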

Logic  programming  techniques  are  used  for  inference  detection  in  the  following  manner.  As  discussed  in 
section 3, one specifies the application as a logic program. As new clauses are added to the program, they are
tested as queries to see if there is an inconsistency. As a simple example, suppose the program consists of the
following clauses.

    Level(X, Secret) <- TEACHER(John, X)

    Level(Mary, Unclassified) <-

    NOT Level(X, Secret) <- Level(X, Unclassified)

They assert that all those who learn from John are Secret and that Mary is Unclassified. Furthermore, the third
clause asserts that any entity which is Unclassified cannot be Secret. Suppose one wants to assert that John
teaches Mary. This is now tested as a query. That is, the following query

    <- TEACHER(John, Mary)

is posed. Through backward chaining, a contradiction is derived. This means that if John were to teach Mary,
there will be an inference problem.
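The effect of this test can be imitated in a few lines. The sketch below does not implement backward chaining; it hard-codes the three clauses and evaluates them forward, which is enough to show the contradiction for this particular program. The function name violates and the tuple encoding of facts are hypothetical.

```python
# Sketch: consistency test for the three clauses above (forward evaluation,
# not backward chaining; the encoding is hypothetical).

def violates(teacher_facts, level_facts, candidate):
    """Return entities that would be both Secret and Unclassified
    if `candidate`, a (teacher, learner) pair, were added."""
    teaches = set(teacher_facts) | {candidate}
    # Clause 1: Level(X, Secret) <- TEACHER(John, X)
    secret = {learner for teacher, learner in teaches if teacher == "John"}
    # Clause 2 (facts): Level(Mary, Unclassified) <-
    unclassified = {entity for entity, level in level_facts if level == "Unclassified"}
    # Clause 3: NOT Level(X, Secret) <- Level(X, Unclassified)
    return secret & unclassified


facts = {("Mary", "Unclassified")}
print(violates(set(), facts, ("John", "Mary")))
```

A non-empty result corresponds to the contradiction derived in the text; an empty result means the new clause can be added without conflict.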

The extended SQL specification generated could be used to apply the inference analysis tool developed by
Binns [BINN92]. Binns' tool uses a technique called secondary path analysis. It is assumed that an attribute of a
relation is classified in order to protect a collection of attributes which includes the classified attribute. For
example, if the grade attribute of a student is classified at the Secret level, it is assumed that it is classified in
order to protect the grade associated with some other attribute of the same relation or possibly a different relation.
A graph structure is used to represent the paths between the various attributes of relations. Paths are obtained by
performing the join operation between the relations. The lines forming the path are classified according to the
classification constraints specified in the schema. If there is a path between two attributes where some part of it is
Secret, and if there is a completely Unclassified path between the same two attributes, then there is a potential for
an inference problem. The tool could point out such problems. To apply Binns' tool, the extended SQL
specifications discussed in section 3 are necessary.
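The idea behind secondary path analysis can be conveyed by a simplified sketch. It is not Binns' algorithm: the graph is given directly as labeled edges rather than derived from joins, and a "path with a Secret part" is reduced to a single Secret edge. The function names and the sample attributes are hypothetical.

```python
# Simplified sketch of secondary path analysis (not the actual tool of [BINN92]).
from collections import deque


def unclassified_reachable(edges, start, goal):
    """Breadth-first search using only Unclassified edges."""
    graph = {}
    for u, v, level in edges:
        if level == "Unclassified":
            graph.setdefault(u, set()).add(v)
            graph.setdefault(v, set()).add(u)
    queue, seen = deque([start]), {start}
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def secondary_paths(edges):
    """Flag each Secret edge whose endpoints are also joined by a fully Unclassified path."""
    return [
        (u, v)
        for u, v, level in edges
        if level == "Secret" and unclassified_reachable(edges, u, v)
    ]


edges = [
    ("name", "salary", "Secret"),
    ("name", "job", "Unclassified"),
    ("job", "salary", "Unclassified"),
]
print(secondary_paths(edges))
```

Each flagged pair marks a protected association that can be reconstructed through Unclassified joins, which is the situation the tool is meant to point out.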

5.  RELATED  WORK 

As  stated  in  section  1,  several  proposals  on  using  conceptual  structures  for  representing  and  reasoning  about 
multilevel  database  applications  have  been  given.  We  provide  a  brief  overview  of  the  various  efforts  and  compare 
the  approach  proposed  in  this  paper  with  the  others. 

To  our  knowledge,  the  use  of  conceptual  structures  to  handle  the  inference  problem  was  first  proposed  by 
Burns [BURN88] and Hinke [HINK88]. While Burns proposed the use of the entity-relationship model,
Hinke's work was on the use of graphs for representing the application. He showed how inferences may be
detected by traversing alternate paths between two nodes in the graph. Further work on the use of conceptual
structures for inference handling was proposed by Smith [SMIT90]. Smith suggests extensions to the semantic
data model discussed in [URBA89] to represent multilevel applications. While he has shown how the model
could  be  used  for  representation,  reasoning  techniques  are  not  addressed.  Thuraisingham  [THUR90]  showed 
how  conceptual  structures  such  as  semantic  nets  and  conceptual  graphs  could  be  used  to  represent  and  reason 
about  the  multilevel  database  application. 

More recently, the development of tools has been reported by Binns, Hinke, and Garvey et al. Binns has
developed a tool for secondary path analysis. As stated in section 4.2.2, this tool takes as input SQL specifications
and generates modified specifications. Hinke has developed a tool called AERIE which is based on conceptual
graphs. Garvey et al. have developed a tool based on semantic nets. At present, Collins [COLL94] is developing
an inference analysis tool using CLIPS.

While  the  approaches  described  above  have  focused  mainly  on  the  inference  problem,  a  more  general 
approach  for  multilevel  database  application  design  has  been  reported  in  [WISE91,  PERN92,  and  SELL93].  The 
approach,  particularly  in  [PERN92]  and  [SELL93],  is  not  only  to  capture  the  structural  aspects  of  the  application, 
but  also  the  dynamic  aspects  of  the  application.  The  goal  is  to  design  the  multilevel  database  and  the  automated 
system.  While  the  inference  problem  has  been  given  some  consideration,  it  is  not  the  major  focus. 

As stated earlier, the approach proposed in this paper focusses on developing a uniform representation
scheme that can be transformed into other representation schemes without much difficulty, so that existing
inference analysis tools can be applied and one obtains the maximum benefit from the tools that are
already available. One could also develop inference analysis tools for MKDM. The major contribution of MKDM
is that it incorporates constructs from data models as well as knowledge models. Therefore, it encompasses the
essential capabilities of the previous models discussed in the literature such as the ones developed by Burns,
Garvey, Hinke, Smith, Thuraisingham, and others. The paper also gives a specification language and a graphical
representation of the model. Since MKDM borrows constructs from the data and knowledge models, the
translation of MKDM and MKDL into other representation schemes such as conceptual structures, logic
programming specification, and extended SQL can be accomplished without much difficulty. In summary, the
strength of the approach proposed in this paper is its generality.^17


^17 One could argue that generality has some disadvantages in that one may not be able to get the full potential of
a single tool. The question of generality vs speciality has been discussed in other fields such as heterogeneous
data/knowledge base systems integration. The ultimate decision would depend on what the client wants. That is,
should one be able to use a collection of inference analysis tools in a reasonable manner or get the maximum
benefit of one tool? Our hope is that this paper will make the community start thinking about addressing some of
these issues as inference analysis tools continue to develop.

6. SUMMARY AND FUTURE CONSIDERATIONS


This paper has described a model called Multilevel Knowledge Data Model (MKDM) and an associated
specification language called Multilevel Knowledge Data Language (MKDL). MKDM combines constructs both
from  data  models  and  knowledge  models.  Because  of  this,  it  has  the  representational  power  of  semantic  data 
models,  and  the  reasoning  power  of  knowledge  models.  In  describing  the  various  constructs  of  MKDM  we  also 
showed  how  potential  security  violations  could  be  detected  during  the  modeling  process.  Next  we  described  how 
GRAPHICAL-MKDM  and  MKDL  could  be  transformed  into  other  representation  schemes  such  as  conceptual 
structures,  logic  programming  specifications,  and  extended  SQL.  Finally  we  discussed  different  inference  types 
and  how  inference  analysis  tools  could  be  applied  on  the  various  representations.  Comparison  of  the  approach 
described  in  this  paper  with  other  approaches  in  the  literature  was  also  given. 

This paper provides the direction for modeling the various entities of a multilevel database application,
capturing the security semantics of the application, and subsequently applying reasoning tools for inference
analysis. Future research should include the following:

!  i)  Develop  inference  analysis  tools  for  MKDM.  Since  MKDM  is  based  on  an  object-oriented  model,  we 
disc  ussed  various  inference  types  that  can  be  uncovered  with  such  a  representation.  Inference  analysis  tools  to 
deteci  potential  problems  with  such  representations  are  yet  to  be  designed.  In  section  4  we  discused  various 
asiTccts  of  such  tools  with  examples.  Tcxils  based  on  generalized  algorithms  need  to  be  developed. 

(ii) Develop transformations to other representations so that current inference analysis tools can be applied.
Some of the techniques for transforming MKDM constructs into other representations such as conceptual
structures, logic programming specifications, and SQL were discussed in section 3. Tools for such
transformations have to be developed.

(iii) Test the tools with examples. One needs to take a real world application, represent it using
GRAPHICAL-MKDM and MKDL, apply the inference analysis tools developed for MKDM, use transformation tools to
generate other specifications, and subsequently apply inference analysis tools such as the ones proposed in
[BINN92, HINK92, GARV92, COLL94]. Such an exercise would demonstrate the usefulness as well as the
robustness of the methodology that we have developed.

Abstract

This paper proposes a model for a multilevel secure federated database.
A federated database is a distributed database that is characterised by a
high degree of site autonomy, yet the sites cooperate on global transactions.

The  proposed  model  has  three  main  features:  (1)  it  is  intended  for 
a  loosely  coupled  federation  with  almost  no  central  authority;  (2)  local 
classification  of  a  data  item  is  honoured  by  all  members  of  the  federation; 
and  (3)  a  site  can  decide  on  the  level  of  sensitivity  of  its  data  that  may  be 
sent  to  each  other  site. 

The model solves the problem where the sites are homogeneous; however,
more work needs to be done for heterogeneous sites.

Keywords:  Security,  object-orientation,  distributed  databases 


1  Introduction 

Distributed databases have some well known advantages; amongst others,
access speed and availability of information can be increased since
information can be stored in close physical proximity to where it is most often
used. On the negative side, distributed databases are more complex than
their centralised counterparts.

Few papers have been published that address the security issues
specifically relevant to distributed databases. In a series of three papers
Thuraisingham et al. [18, 28, 29] have developed models for multilevel secure
relational databases that assume three increasingly complex database
architectures. In the first paper [18] the data distribution reflects the
classification of the data; however, this database is not a true distributed database
[29, p662]. The second paper [28] is based on a true distributed database
and assumes that the various databases forming the distributed database
are homogeneous. The third paper [29] also uses a true distributed
database, but allows for limited heterogeneity: only the classification ranges
for data at various nodes are restricted.

Bull, Gong and Sollins [3] argue that security in a federated system
should be governed from the servers and not by using conventional access
control lists or capabilities. They do not address multilevel security. Gong
and Qian [10] show that interoperation of systems can cause unintended
(indirect) access to information and prove that elimination of such
unintended access for a simplified case is NP-complete. One solution is to
achieve secure global interoperation “incrementally by composing secure
local interoperation.”

Obviously, since a distributed database is a collection of other
databases, most of the results obtained for secure centralised databases also
apply in the case of distributed databases; see [7, 21] for overviews of
centralised security and [11, 12, 13, 15, 20, 27] for examples of models for
(centralised) secure object-oriented databases and [16, 17] for a relational
example.

We are of the opinion that the additional security requirements posed
by distributed databases do depend on the architecture of the distributed
database. To illustrate this, consider the classification for such databases
described by [22]: they consider (1) the autonomy of the participating
databases, (2) their (geographical) distribution, and (3) whether they are
homogeneous or heterogeneous. As an example, consider a distributed
database where the participating sites are homogeneous with little local
autonomy. Such a database probably needs very little more for security
than a centralised database needs (apart from a secure way to communicate
between the participating sites). This applies whether the database is
geographically distributed or not. On the other hand, any distributed
database that combines heterogeneous participating databases will require
a significant amount of additional security facilities to co-operate securely.

It would therefore seem that various categories of distributed
databases warrant investigation. For example, the papers by Thuraisingham
and others mentioned earlier [18, 28, 29] use different database
architectures. Similarly, while the paper by Varadharajan and Black [31] on
‘Distributed Object-oriented Databases’ does not specifically address any
‘distributed’ aspects, it does apply to what Ozsu and Valduriez [22] call an
integrated database: a distributed database that consists of homogeneous
sites with little local autonomy. Other papers dealing with secure
centralised databases (such as those cited earlier) similarly apply to such
integrated databases, as long as the described models do not incur
excessively high communication costs.

The categories of secure distributed databases that have not received
enough attention in the literature seem to be those distributed databases
that provide a high degree of local autonomy to the participating sites
(the concern of this paper) and the various possibilities that exist for
heterogeneous sites. Note that the latter category includes a number of
cases, from those where only the security systems differ, through the case
where the data models are similar (but not identical) to the distributed
databases where the data models need not be the same at the participating
sites. As an example, the sites of the model described in [29] only differ
in the sensitivity ranges of data that are allowed on each site; one site
may for example contain data in the range restricted to top secret, while
another may contain information in the range unclassified to secret.

This paper proposes a model for a distributed database that allows
a high degree of local autonomy. To use the classification of Ozsu and
Valduriez [22, p81], we are interested in a federated database, that is one
that does allow a high degree of local autonomy, but where the sites can
cooperate on global queries and transactions (compared to a so-called
multidatabase, where the participating sites are autonomous but very loosely
integrated). We assume that no central authority exists that can decide
on (for example) classification of data and clearance of users. This model
will be referred to as SeFD (Secure Federated Database).

The required site autonomy of a federated database makes security
a fundamental issue of such a database since “site autonomy is achieved
when each site is able both to control accesses from other sites to its own
data and to manipulate its data without being conditioned by any other
site” [4, p323].

SeFD uses multilevel security, that is, security where the decision
whether a subject should be allowed to access an entity is based on the
clearance of the subject and the sensitivity of the entity (and the type of access
requested). For example, a subject is often allowed to read the entity if
the clearance of the subject dominates the sensitivity of the entity. See
section 2.2 for more details.

SeFD assumes that the participating sites are homogeneous and (more
specifically) object-oriented. By homogeneity we mean that the
participating sites use the same data and security models; to simplify matters
one can assume that the DBMSs at the various sites are copies of the
same product. In particular, clearance and sensitivity levels at one site
will directly correspond to levels at other sites. We make the
assumption of homogeneity in order to concentrate on the questions regarding
local autonomy; see section 5 for a short reflection on databases where
homogeneity is not assumed. The assumption that the model should be
object-oriented is not necessary, but has been made because this work
forms part of a more comprehensive project dealing with secure
object-oriented databases [19, 20, 21]. Many of the comments made do apply to
other database models; however, this will not be discussed in the current
paper.

The next section contains background material on object-oriented
databases, multilevel security and federated databases. Section 3 then
considers the security requirements posed by a federated database. Section 4
describes the proposed model. This is followed by the conclusion,
including a short description of future research.

2  Background 

This section briefly introduces the concepts that form the basis of issues
used in this paper. These introductions are intended to enable readers not
familiar with the concepts to follow the rest of the paper. It is also intended
to indicate the particular meanings associated with terms in this paper,
especially terms that do not have a univocal meaning in the literature. This
section is not intended as a comprehensive treatment of the concepts, or
used in defence of this paper’s view of the concepts; in particular, where
alternatives exist, these alternatives are not pointed out. References are
given for those readers who require a more comprehensive treatment.

2.1  Object-oriented  databases 

Object-oriented databases are databases that use object-oriented concepts
to implement the database. The basic unit used to store data is the
object. An object represents a logically single entity; it is an encapsulated
unit consisting of both the data (instance variables) and procedural code
(methods) to manipulate the data. An object may only be accessed by
activating its methods. A method is activated by sending a message to it.
A method itself consists of a sequence of messages to be sent to objects
(interleaved with operations to read and write instance variables). Often
reading and writing of instance variables are modelled as messages sent to
(and replies received from) instance variable ‘objects.’ A database request
is initiated by a user (or application program) sending the first message;
the remainder of the request then consists of a sequence of messages sent
between objects.

Objects are instantiated from classes; we assume that a class is an
object itself that serves as a template for the instances (objects) of that
class. Classes can be derived from other classes by adding variables and/or
methods. A class thus derived is known as a subclass of the other class,
while the original class is known as the superclass of the derived class. The
process where the subclass uses the same declarations for variables and/or
methods as its superclass is known as inheritance.

We assume that all data items in the model are objects; this includes
classes and ‘primitive’ items such as integer variables. This assumption is
also made by Smalltalk [9], which serves as our model of object-orientation.

See [1, 2, 14] for a description of object-oriented databases, [5] for a
comprehensive treatment of databases in general and [32, 33] for a
description of the object-oriented paradigm.

2.2  Multilevel  security 

In a secure system, requests to access resources are allowed or disallowed
depending on security criteria. The possible issuers (originators) of
requests are usually referred to as subjects. The resources accessed by the
request are usually referred to as objects; however, in this paper the term
entity will be used to refer to the target of a request and the term object
will be used exclusively in its object-oriented sense.

The criteria used to decide whether a subject should be allowed to
access an entity are usually divided into two categories. In discretionary
security, entities are owned by specific subjects; such a subject then has
the discretionary power to grant other subjects access rights to its
entities (and to revoke such rights from other subjects). Multilevel security
(or mandatory security) refers to a system where all subjects are grouped
into categories; similarly all entities are grouped into categories, and then
it is indicated which categories of users are allowed to access entities in
any given category. This is often accomplished by assigning clearance
labels to subjects and sensitivity labels to entities and then only allowing
a subject to access an entity if a specific relationship holds between the
clearance of the subject and the sensitivity level of the entity. For example,
a subject is often allowed to read the entity if the clearance of the subject
dominates the sensitivity of the entity. In contrast, a subject is usually
allowed to write to the entity if the clearance of the subject is dominated by
the sensitivity of the entity. Further, access restrictions remain in place,
even if the information is copied or otherwise manipulated. In contrast
to discretionary security where individual subjects have the authority to
grant access to other subjects, security classifications in a mandatory
secure system are determined by a particular individual (or group) with this
responsibility for the entire database. This individual (or group) is often
known as the system security officer.
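
The read and write rules just described can be illustrated with a minimal Python sketch. It assumes, purely for illustration, four totally ordered levels; the names Level, dominates, may_read and may_write are hypothetical and do not come from the paper, and a real system may use partially ordered labels instead.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical, totally ordered labels (an assumption of this sketch)."""
    UNCLASSIFIED = 0
    RESTRICTED = 1
    SECRET = 2
    TOP_SECRET = 3


def dominates(a: Level, b: Level) -> bool:
    """Return True if label a dominates label b (a is at least as high as b)."""
    return a >= b


def may_read(clearance: Level, sensitivity: Level) -> bool:
    # Read is allowed when the subject's clearance dominates the
    # sensitivity of the entity.
    return dominates(clearance, sensitivity)


def may_write(clearance: Level, sensitivity: Level) -> bool:
    # Write is allowed when the subject's clearance is dominated by the
    # sensitivity of the entity.
    return dominates(sensitivity, clearance)
```

Under these rules a subject cleared to secret may read unclassified data but may not write to it, which is what keeps information from flowing downwards.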

See [23, pp285-286] for a discussion of discretionary security and [23,
pp329-340] for a discussion of multilevel security; [21] contains a
description of a variety of approaches to multilevel security currently used in
secure object-oriented databases.


2.3  Federated  databases 

A federated database is a distributed database that “consists of component
DBSs [database systems] that are autonomous yet participate in a
federation to allow partial and controlled sharing of their data. Association
autonomy implies that the component DBSs have control over the data
they manage. They cooperate to allow different degrees of integration”
[24, p189].

SeFD will assume that the component databases have compatible
security systems and that the component databases are capable (and willing)
to exchange the security information required for the operation of SeFD.

We assume that the federated database is homogeneous, not
globally controlled and that no federal schema exists. See [24] for a
discussion of these and other issues that exist for federated databases. The
overview given by [22, pp66-89] gives a clear positioning of federated
databases amongst the possible alternatives for distributed databases. An
introduction to distributed databases can also be found in [6].

We  will  assume  that  the  local  autonomy  of  each  site  implies  that  the 
site  has  definite  rights  over  the  information  stored  at  that  site.  When  we 
say  that  a  site  ‘owns’  local  information,  we  will  refer  to  these  rights. 

It  is  possible  that  ownership  of  information  can  be  transferred;  however, 
relocation  of  information  does  not  necessarily  imply  transfer  of  ownership. 
This  will  be  dealt  with  in  detail  later. 

Also note that a federated database must “be able to grow
incrementally and to operate continuously, with new sites joining to existing ones,
without existing sites to agree with joining sites on global data structures
or definitions” [4, p323]. This requirement will be taken into account when
SeFD is described.


3 Security requirements of a federated database

As stated earlier, local autonomy is the distinguishing characteristic of
federated databases, and the site’s ability to control access to its information
is a fundamental aspect of local autonomy.

In the first instance, local autonomy implies local classification of local
information. Similarly, one can argue that local autonomy also implies
that the site should be able to determine the clearance of subjects directly
associated with that site.

These two requirements have two implications affecting the site’s
participation in the federation. Firstly, the site’s decision about the
classification of its information should be respected throughout the federation.
This implies that no member of the federation should disclose information
to a party that the owner would not have disclosed it to. Secondly, a site
may not be equally willing to share its information with all other sites in
the federation. This may be because a particular site does not agree with
the subject clearance assignment policy of another site (and may therefore be
unwilling to share, say, top secret information with that site because this
site does not trust some subjects of that site who have been assigned top
secret clearances by that site). The site may also be unwilling to share
sensitive information with that site because it has evidence (or suspicion)
that the other site does disclose the information to unacceptable parties.
The reason why some information is not to be disclosed to a particular
site may also be the different roles different parties play in the federation.
To illustrate, consider a federation of commercial databases consisting of
bank and retail databases. The banks may be willing to share information
with one another that they are not willing to share with retailers.

We will use the phrase site trustedness of site A from the viewpoint of
site B to refer to the maximum sensitivity of information that site B is
willing to share with site A. Often we will abbreviate this to the trustedness
of site A when the identity of the owner site is obvious.

This means that for any site S and any sensitivity level L a set of
trusted sites can be computed, that is, the sites to which site S is willing to
send information with sensitivity level L. The notation T(S, L) will be used
to indicate such a set, and it will be referred to as the trusted site set of S
at level L.
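
As an illustration, the trusted site set can be computed directly from the trustedness that an owner site assigns to each other site. The following Python sketch assumes four totally ordered levels and a simple dictionary representation; the names LEVELS, RANK, trusted_site_set and the example sites are hypothetical and are not part of the model itself.

```python
# Hypothetical, totally ordered sensitivity levels (lowest first).
LEVELS = ["unclassified", "restricted", "secret", "top secret"]
RANK = {name: position for position, name in enumerate(LEVELS)}


def trusted_site_set(trustedness: dict[str, str], level: str) -> set[str]:
    """Compute T(S, L) for an owner site S.

    `trustedness` maps each other site to the maximum sensitivity that S
    is willing to share with it (the trustedness of that site from the
    viewpoint of S).
    """
    return {
        site
        for site, max_level in trustedness.items()
        if RANK[max_level] >= RANK[level]
    }


# Example: a bank that trusts another bank more than it trusts a retailer.
bank_a_trustedness = {"bank_b": "top secret", "retailer_c": "restricted"}
secret_sites = trusted_site_set(bank_a_trustedness, "secret")
```

In this example only bank_b is in the trusted site set at level secret, while both sites are in the set at level restricted.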

Secure interoperation always raises the issues of understanding and
enforcement: firstly, how does the global system ensure that all
component systems understand the security policy in the same way and, secondly, what
guarantees can be provided that the other systems will properly enforce
restrictions? In our case understanding at the technical level is trivially
solved because the same security systems are used. Understanding at a
higher level can be a problem: one site can use different criteria for
classifying data or users than another site. The solution that we are proposing
in this case is that data (at a given level) is simply not sent (directly or
indirectly) to a site before the system security officer at the sending site
is convinced that the receiving site has an acceptable security policy for
data at the concerned level. Enforcement is ensured because we assume
that the various sites use the same software. If this were not the case, one
site would again need to convince the other site that it does enforce security
properly before the other site would send information to this site.

This paper is therefore based on the following two fundamental
security requirements of a federated database (in addition to the normal
‘centralised’ requirements that apply at each site):

1. Federation-wide ‘respect’ for the owner’s limitations on the treatment
of its data; and

2.  The  ability  of  a  site  to  limit  the  sensitivity  of  information  owned  by 
it  to  be  sent  to  any  particular  site  in  the  federation. 


4  Proposed  model 

The next section gives some assumptions made about security in SeFD,
in addition to the security requirements identified earlier for federated
databases. Next ownership of data and the related concept of trusteeship
are considered. This is followed by descriptions of cases where entities are
relocated or instantiated (for example temporary object relocation, object
emigration and object replication). This is followed by a treatment of
changes in the federation.

4.1 Security assumptions

SeFD assumes that the individual sites are multilevel secure databases
themselves. The security provided by SeFD coincides with that provided by
the component databases, except for the additional requirements described
in section 3.

Using  the  taxonomy  described  in  [21],  we  make  the  following  security 
assumptions  about  SeFD: 

X1.1 The labels used to classify subjects and entities are partially
ordered; a method can read from a variable if the clearance of the
subject dominates (≥) the sensitivity of the variable; a method can
write to a variable if the clearance of the subject is dominated by the
sensitivity of the variable; a method can be activated if the clearance
of the subject dominates the sensitivity of the method.

X1.2 SeFD uses existence protection; that is, the fact that an entity
exists is considered as sensitive as the information represented by (or
contained by) the entity.

X2.1 Classes and objects can be protected (labelled), as well as their
methods and instance variables.

X2.2  When  an  object  is  instantiated,  it  is  labelled  according  to  rules 
specified  in  its  class;  it  can  be  relabelled  by  the  system  security  officer 
at  the  site  where  the  object  resides. 

X2.3 No additional restrictions apply except those restrictions that apply
to all existence protected models identified in [21]: amongst others,
the sensitivity of methods and instance variables dominates that of
their object; instances are at least as sensitive as their classes and
subclasses as sensitive as their superclasses.

X3.1 The authorisation of a message is determined by the clearance of
its primary accessor. The authorisation can be reduced by the sites
that participate in a request as detailed in subsequent sections.

X3.2 Message sensitivity is determined by the ‘normal’ rules as described
in [21]: the sensitivity of a message is increased whenever it accesses
(reads) a value more sensitive than the current sensitivity of the
message. The sensitivity of a value is determined by the sensitivity
of the object and variable that contains the value or the sensitivity
of the method that returns the value.

X3.3 When a message cannot update the intended entity because the
contents of the message are too sensitive, the request will be rejected.

Parameters X2.3, X3.2 and X3.3 do not have a significant influence on
SeFD. The other parameters will be used in the description of SeFD.
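
To make assumptions X3.2 and X3.3 concrete, the following Python sketch tracks the sensitivity of a message as it reads values and rejects an update that would write down. It is an illustration only: levels are represented as plain integers (a total order, whereas X1.1 allows a partial order), authorisation (X3.1) is not modelled, and the names SensitiveMessage and RequestRejected are hypothetical.

```python
class RequestRejected(Exception):
    """Raised when a message is too sensitive to update its target (X3.3)."""


class SensitiveMessage:
    """A message whose sensitivity rises as it reads values (X3.2)."""

    def __init__(self, sensitivity: int = 0) -> None:
        # Levels are integers here; a higher number means more sensitive.
        self.sensitivity = sensitivity

    def read(self, value_sensitivity: int) -> None:
        # Reading a more sensitive value raises the message sensitivity;
        # reading a less sensitive value leaves it unchanged.
        self.sensitivity = max(self.sensitivity, value_sensitivity)

    def write(self, target_sensitivity: int) -> None:
        # The update is rejected if the message contents are more
        # sensitive than the entity being updated.
        if self.sensitivity > target_sensitivity:
            raise RequestRejected("message contents too sensitive for target")
```

With partially ordered labels, the call to max would be replaced by the least upper bound of the two labels and the comparison by a dominance test.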

4.2  Ownership 

This section deals with the ownership rights that a site has over data
stored at that site. Note that ownership as used in this case should not
be confused with ownership in a model for discretionary security: in
discretionary security an owner is a subject that has the discretionary power
to share entities owned by it with other subjects (see [23, pp285-286] for
example). In the current model owner refers to the site that has the
authority to decide with which other sites the entity may be shared.

A fundamental property of mandatory security is that an entity should
be as protected wherever it might be moved to in a system as it has been in
its original location [21]. In an object-oriented database objects are
continually sent (as parameters) with messages. (Remember we consider every
data item in an object-oriented system to be an object.) In a distributed
database, these objects will often be sent across site borders. Another
possibility for objects to be moved across site borders is object relocation,
either temporarily for query optimisation purposes, or permanently when
the reason for placing information at a specific site changes. A last
possibility for such moves across site borders is object replication at more
than one site, in most cases for efficiency or reliability. Each of these three
possibilities will be discussed shortly.

Before  discussing  the  individual  possibilities  for  such  object  movement, 
the  principle  to  be  used  in  these  cases  should  be  considered.  If  one  agrees 
that,  under  normal  circumstances,  the  object  should  be  as  protected  in  its 
new  location  as  it  has  been  in  its  original,  it  means  that  the  classification 
label  of  the  object  (including  labels  associated  with  any  facets  of  the  object, 
such  as  methods  and  instance  variables)  should  be  transferred  with  the 
object.  However,  this  is  not  adequate.  If  the  site  where  this  object  resided 
originally  has  not  been  willing  to  share  its  contents  with  a  site  X,  the  new 
site  where  it  resides  now  should  also  not  share  its  contents  with  site  X. 
Stated  in  terms  of  ownership:  Any  other  site  should  treat  an  entity  only 
according  to  the  wishes  of  its  owner,  by  only  allowing  subjects  access  to 
the  entity  if  the  owner  would  have  allowed  it  and  only  sharing  the  entity 
with  other  sites  if  the  owner  is  willing  to  share  it. 

In some cases ownership may be transferred to a new site; this may be
the case when an object relocates permanently to another site. However,
for temporary ‘visits’ to other sites, such as when a message includes an
entity as a parameter, ownership will not change.

The  following  subsections  discuss  the  various  possibilities  for  objects 
that  are  not  currently  located  at  their  owner  sites. 

4.3  Trusteeship 

Since the distributed database operates by sending messages between the
participating sites, it will often happen that the sites contain information
owned by another site. Unless ownership changes together with the
transferring of the information, the receiving site can only use the information
in ways acceptable to the owner; in other words, the receiving site acts
as trustee for any information received in this way. This section considers
the case where such information is received as part of a message sent to
the site; subsequent sections deal with object relocation and replication.

SeFD assumes that information received as part of a message never
implies a transfer of ownership to the receiving site. The receiving site is
thus restricted when using such information. Some of the situations can
be dealt with quickly and those will be addressed first.

Suppose that an object residing at site A wants to send a message to
an object residing at site B. However, assume that sites A and B are not
connected, but both are connected to a third site C. Also assume that the
sensitivity of the message to be sent is such that site A is willing to send
it to site B, but does not trust site C enough to accept a message of this
sensitivity. SeFD assumes that the underlying communication system is
trusted and that the communication system is able to route information
via a not-so-trusted site in a secure way. This can be done by, for example,
encrypting the message when it is transmitted at the sending site such that
it can only be decrypted at its intended destination. A site that merely
routes a message therefore does not ‘see’ any contents of the message.

A related, but more complex problem concerns the case where a message
is not simply routed via an intermediate site but sent to an object
residing at that site and where that object then sends a message to a third
site. Suppose, for example, that an object at site A wants to send a top
secret message to an object at site B. Suppose further that site A does
not want to send top secret information to site C. However, suppose that
site B does trust site C enough to send top secret information to it. What
are  the  implications  if  the  target  object  at  site  B  now  sends  a  message  to 
an  object  at  site  C,  containing  the  top  secret  information  originally  sent 
by the object at site A? The conservative (but safe) approach usually followed
by security models is to assume that a message that is sent following
receipt  of  another  message  is  at  least  as  sensitive  as  the  received  message. 
This  means  if  the  received  message  was  not  supposed  to  be  sent  to  a  site 
C,  no  subsequent  messages  (in  the  current  request)  can  be  sent  to  site  C. 

In  order  to  discuss  possible  solutions,  we  introduce  the  term  message 
trusted  site  set  to  refer  to  the  set  of  sites  that  can  still  participate  in  the 
request.  The  message  trusted  site  set  (logically  or  physically)  accompanies 
the  message.  This  set  initially  consists  of  all  sites.  Whenever  a  site  S 
contributes  information,  all  sites  that  the  contributed  information  should 
not  be  sent  to,  are  removed  from  the  set.  Whenever  a  message  has  to 
be sent to another site, the sending site will check the set to determine
whether  the  message  can  indeed  be  sent  to  that  site.  The  message  trusted 
site  set  will  also  be  referred  to  as  the  message  set  for  the  sake  of  brevity. 

The message set need not be represented physically: it can be computed.
However, SeFD does include the message trusted site set with the
message.  A  possible  implementation  strategy  will  be  discussed  shortly. 

This  solution  to  determine  the  message  trusted  site  set  lies  midway 
between  two  other  possibilities: 

• On the one side the sending site can contain enough information
about other sites that makes it unnecessary to consider the message
trusted site set dynamically, because a request will never be initiated
that  sends  information  to  an  unacceptable  site. 

This  solution  has  a  number  of  drawbacks.  Firstly,  implementation 
details  (including  objects  used  to  implement  other  objects)  are  likely 
to  be  confidential  information  in  a  federated  database  and  not  easily 
shared  with  all  other  sites.  Further,  it  is  suggested  that  one  site  is 
not  allowed  to  inform  a  second  site  about  entities  available  on  a  third 
site.  Thirdly,  it  is  against  the  spirit  of  the  encapsulation  principle  to 
make  use  of  such  encapsulated  information  at  all;  in  particular,  if  the 
implementation  of  an  object  is  changed  it  may  cause  a  ripple  effect 
throughout  the  security  system  of  the  federated  database.  Lastly,  the 
information  that  needs  to  be  transmitted  before  the  trustedness  of 
all  objects  (or  methods)  can  be  determined  seems  to  be  prohibitively 
high. 

•  On  the  other  side  the  message  trusted  set  may  not  be  included  with 
the  message,  but  rather  computed  when  required. 

A  drawback  of  this  solution  is  the  high  communication  overhead  for 
all  the  anticipated  approval  requests. 
Note that the option followed by SeFD to include the message trusted
site set with the message need not incur excessive overhead: a simple
solution (that will be revised later) is to include a bit string with every
message, with one bit per site¹. A one can then indicate that that site
may  still  be  involved  in  processing,  while  a  zero  may  indicate  the  contrary. 
When  a  message  originates  from  a  primary  user  (for  example  a  human 
operator)  all  bits  are  set  to  one.  A  site  can  then  remove  any  site  from  the 
potential contributors by just setting the corresponding bit to zero for any
message  that  it  sends.  No  other  site  is  allowed  to  change  any  bit  to  a  one. 
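
The bit-string representation described above can be sketched as follows. This is only an illustration under stated assumptions: the four site names, the helper names `exclude` and `may_send`, and the use of a Python integer as the bit string are not part of SeFD.

```python
# Illustrative sketch only: site names and helper names are assumptions.
SITES = ["A", "B", "C", "D"]                        # list position = bit position
BIT = {name: 1 << i for i, name in enumerate(SITES)}
ALL_SITES = (1 << len(SITES)) - 1                   # all bits set to one


def exclude(message_set: int, *sites: str) -> int:
    """Clear the bits of the given sites; a bit is never set back to one."""
    for name in sites:
        message_set &= ~BIT[name]
    return message_set


def may_send(message_set: int, site: str) -> bool:
    """True if the site may still be involved in processing the request."""
    return bool(message_set & BIT[site])


# A request that originates from a primary user starts with every site allowed.
message_set = ALL_SITES
# Site A contributes information that it is not willing to release to site C.
message_set = exclude(message_set, "C")
```

Because sites are only ever removed, applying `exclude` for a site that is already absent leaves the set unchanged.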

The message handler of any site now needs to do the following when it
receives  a  message: 

•  Start  a  method  activation  for  the  indicated  method. 

•  If  data  is  accessed  at  any  point  by  the  method  activation,  determine 
the  sensitivity  of  the  data. 

• Determine the trusted site set of the site where the message is executing
for the sensitivity level of the data just accessed (as a bit
string) and logically AND this set with the bit string accompanying
the message to get a new message trusted site set.

• If the sensitivity level of the message does not dominate the sensitivity
of the data just accessed, set the sensitivity of the message to the
least upper bound of its current sensitivity and that of the accessed
data.

• If the method activation attempts to write to an entity and the sensitivity
of the entity does not dominate the sensitivity of the message,
abort the message.

The message handler described above has to be a trusted process on the
site  where  it  executes. 
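
The handler steps listed above can be sketched as follows. This is a minimal illustration, not the SeFD implementation: the three totally ordered labels, the class names, and the `on_read`/`on_write` hooks are assumptions made for the example, and a partially ordered label set would need a proper least-upper-bound function in place of `max`.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    # Assumed totally ordered labels, so the least upper bound is simply max().
    UNCLASSIFIED = 0
    SECRET = 1
    TOP_SECRET = 2


class MessageAborted(Exception):
    """Raised when a write would move information to a less sensitive entity."""


@dataclass
class Message:
    sensitivity: Level
    trusted_sites: int      # message trusted site set as a bit string


class MessageHandler:
    def __init__(self, site_trust: dict):
        # site_trust[level]: bit string of the sites that this site trusts
        # with data of that sensitivity level.
        self.site_trust = site_trust

    def on_read(self, msg: Message, data_level: Level) -> None:
        # AND the local trusted site set for this level into the message set.
        msg.trusted_sites &= self.site_trust[data_level]
        # Raise the message sensitivity to the least upper bound.
        msg.sensitivity = max(msg.sensitivity, data_level)

    def on_write(self, msg: Message, entity_level: Level) -> None:
        # The entity written to must dominate the message sensitivity.
        if entity_level < msg.sensitivity:
            raise MessageAborted("entity does not dominate the message")
```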

We  therefore  conclude  that 

•  A  secure  communication  system  has  to  be  used  to  link  various  sites 
(implying  that  only  the  sending  and  receiving  sites  will  be  able  to 
access  the  message  contents);  and 

•  Any  message  must  carry  with  it  (logically  or  physically)  a  list  of  sites 
that  may  be  involved  in  handling  (‘executing’)  the  message  or  receive 
data  obtained  as  a  result  of  this  message. 


4.4  Temporary  object  relocation 

When an object moves temporarily to another site it will probably happen
in order to optimise some query. Obviously, this does not mean that
its ‘security’ should change: the restrictions imposed by the owner site
still  apply.  It  is  therefore  necessary  for  the  new  location  to  take  such 
restrictions  into  account. 

This  may  be  implemented  by  attaching  the  owner  site’s  trusted  site 
set for every sensitivity level to the relocated object. The sending site can
then  take  this  into  account  whenever  the  object  takes  part  in  an  exchange 
of  messages.  However,  the  trusted  site  sets  are  likely  to  be  confidential 
information  not  readily  shared  with  other  members  of  the  federation. 

¹ As one referee pointed out, a federation of 8000 sites implies an overhead of approximately
1 kilobyte per message; again an indication that security can be costly.

An alternative implementation that is more likely to be acceptable is
to  expect  the  objects  to  protect  themselves  while  residing  at  the  other 
site:  the  security  relevant  information  (current  message  sensitivity  level 
and  message  trusted  site  set)  now  forms  inherent  parameters  whenever 
any  method  is  accessed.  The  method  then  updates  the  message  set  of 
acceptable  locations  appropriately.  The  information  necessary  to  update 
this  set  can  be  contained  as  part  (data)  of  the  relocated  object  or  can  be 
obtained  by  this  object  from  its  owner  site.  A  combination  approach  is 
also  possible  where  the  object  may  contain  less  sensitive  sets  internally, 
but  requests  a  server  at  its  owner  site  to  update  the  message  set  for  more 
sensitive site sets. The choice of which technique will be used for a particular
object is left to the owner site of the concerned object. The owner
site  may  also  take  additional  steps  to  protect  access  sets  included  in  the 
object,  such  as  encrypting  these  sets.  The  object  can  further  be  protected 
by  (physically)  removing  facets  that  should  not  leave  the  site  at  all. 

We  therefore  recommend  that  the  responsibility  to  update  the  message 
trusted  site  set  is  encapsulated  in  each  object  and  the  implementation 
details  of  this  action  be  left  to  the  owner  site  of  the  object. 
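
One way to picture the combination approach is the sketch below, in which a relocated object keeps the less sensitive site sets internally and asks its owner site for the rest. All names are hypothetical: `ask_owner` stands in for whatever call a server at the owner site would offer, and levels are plain integers for brevity.

```python
from typing import Callable, Dict, Tuple


class RelocatedObject:
    """Object residing at a foreign site that protects itself (illustrative)."""

    def __init__(self, local_sets: Dict[int, int], ask_owner: Callable[[int], int]):
        self._local_sets = local_sets   # level -> owner's trusted site set (bit string)
        self._ask_owner = ask_owner     # hypothetical request to the owner site

    def _owner_set(self, level: int) -> int:
        # Less sensitive sets travel with the object; others are fetched on demand.
        if level in self._local_sets:
            return self._local_sets[level]
        return self._ask_owner(level)

    def read(self, level: int, msg_level: int, msg_set: int) -> Tuple[int, int]:
        """Return the updated (message level, message set) after data of
        sensitivity `level` has been released by this object."""
        return max(msg_level, level), msg_set & self._owner_set(level)
```

The message level and message set are passed in and returned explicitly here, mirroring the idea that they form inherent parameters of every method access.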

Note  that  there  may  be  sites  to  which  an  object  cannot  move;  see 
section  4.7  (Object  instantiation)  for  details. 


4.5  Object  emigration 

Permanent  relocation  of  an  object  will  be  referred  to  as  emigration.  When 
an  object  emigrates  to  another  site  ownership  is  likely  to  change  to  the 
new  site.  If  ownership  does  not  change,  emigration  can  be  handled  exactly 
like  temporary  relocation. 

If  ownership  does  change,  the  trusted  site  sets  of  the  old  and  new 
owners  can  be  compared;  if  the  new  owner  has  the  same  (or  a  stricter) 
view  of  other  sites  than  the  old  site,  the  object  can  be  relocated  without 
any  problems.  However,  this  process  cannot  be  automated  because  the 
sites  cannot  access  one  another’s  trusted  site  sets.  Further,  if  the  new  site 
is  willing  to  send  information  to  some  site  not  acceptable  to  the  old  site, 
the  emigration  cannot  occur. 
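
The comparison itself is simple when both sets are at hand; the sketch below shows the check that the security officers would in effect be carrying out, assuming (unlike the sites themselves) that they can see both owners' trusted site sets as bit strings keyed by sensitivity level. The function name is an assumption.

```python
def emigration_acceptable(old_sets: dict, new_sets: dict) -> bool:
    """True if, at every level, the new owner trusts no site that the old
    owner does not trust (the same or a stricter view of other sites)."""
    return all((new_sets[level] & ~old_sets[level]) == 0 for level in old_sets)
```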

In all cases manual intervention is the appropriate action: the concerned
system security officers should decide whether the emigration can
occur  and,  if  it  does  occur,  the  entity  becomes  the  property  of  the  new 
site  and  is  not  owned  by  the  old  site  in  any  way  anymore. 

Note  again  that  there  may  be  sites  to  which  an  object  cannot  emigrate; 
see  section  4.7  (Object  instantiation)  for  details. 

4.6  Object  replication 

An object is replicated if copies of a (logically) single object occur at more
than  one  site  (see  for  example  [5,  p624]).  Replication  presents  a  number 
of  interesting  problems,  in  particular  propagating  an  update  from  any  one 
copy  to  all  copies  (see  for  example  [6,  p293]). 

Consider  ownership  of  such  a  replicated  object.  For  simplicity,  SeFD 
only  allows  single  ownership.  This  matches  the  idea  of  a  primary  copy  of 
the  replicated  object  (see  for  example  [5,  p630]);  the  owner  of  the  primary 
copy  then  owns  the  replicated  object  (including  all  copies).  The  security 
of copies of the replicated object is then treated exactly like that of objects that
have temporarily been relocated.

Note that there may be sites where a specific object cannot be replicated;
see section 4.7 (Object instantiation) for details.

4.7  Object  instantiation 

An object carries with it information about its class. Since SeFD uses
existence protection, information such as which methods a particular object
supports (and hence which methods are available from its class) is
considered protected information. The question that needs attention is
whether an object can be instantiated at any site, or, for example, only at
the site containing its class.

Since the structure of the class is only to be shared with those sites
trusted  enough  by  the  site  owning  the  class,  an  object  can  clearly  only  be 
instantiated  at  sites  acceptable  to  the  owner  of  the  class.  The  set  of  such 
sites  depends  on  the  classification  level  (sensitivity)  of  the  class.  Further, 
once  instantiated,  the  object  cannot  be  relocated  to  a  site  where  it  could 
not  have  been  instantiated  in  the  first  place. 

SeFD uses the following mechanism to ensure proper instantiation and
relocation of objects: To instantiate a new object, a message is sent to
the class of the new object. The method carries with it the site where
the instantiated object is to reside. (As a default, the site where the instantiation
request originated will be the site containing the new object.)
The class then verifies (with its owning site) whether the object can be
instantiated at the requested site, and instantiates it if acceptable; if not,
the request is denied. In addition, all objects contain a method that determines
whether any new site is an acceptable location for the object; before
the object is relocated (moved or replicated) the site currently containing
the object will activate this message to verify that the planned new site is
indeed acceptable. This method can either contain the list of acceptable
destinations,  or  can  check  with  the  site  containing  its  class. 

Subclass creation has a similar problem: If the subclass is instantiated
at site Y while its superclass is owned by site X, obviously any information
about the superclass inherited by the subclass should be treated according
to the wishes of site X. The conservative approach followed by SeFD is to
allow the subclass (or its instances) to be accessed only by sites acceptable
to both site X and Y. This is accomplished by requiring that an instance
of any subclass has the option to deny a request to relocate it to another
site. If it does not deny the request, the decision is made by the class,
exactly  like  the  decision  would  have  been  made  for  a  request  to  relocate 
an  instance  of  the  class. 

All classes therefore include an acceptable site method that is invoked
when an object is moved. This same method is used by the create method
of the class. Further, this method will always request permission from
the corresponding method of its superclass before granting the requested
approval. 
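
A minimal sketch of this chained approval follows. The class and method names (`ClassObject`, `acceptable_site`, `create`, `relocate`) and the use of plain sets of site names are assumptions for illustration; in SeFD the check may instead consult the owning site.

```python
class ClassObject:
    """A class treated as an object and owned by one site (illustrative)."""

    def __init__(self, name, acceptable, superclass=None):
        self.name = name
        self.acceptable = set(acceptable)   # sites acceptable to this class's owner
        self.superclass = superclass

    def acceptable_site(self, site) -> bool:
        # Request permission from the superclass before granting approval.
        if self.superclass is not None and not self.superclass.acceptable_site(site):
            return False
        return site in self.acceptable

    def create(self, site):
        # Instantiation uses the same check as relocation.
        if not self.acceptable_site(site):
            raise PermissionError(f"{self.name} cannot be instantiated at site {site}")
        return Instance(self, site)


class Instance:
    def __init__(self, cls, site):
        self.cls = cls
        self.site = site

    def relocate(self, new_site):
        # An object cannot move to a site where it could not have been instantiated.
        if not self.cls.acceptable_site(new_site):
            raise PermissionError(f"site {new_site} is not acceptable")
        self.site = new_site
```

With a superclass acceptable at sites X, Y and Z and a subclass acceptable at Y, Z and W, instances of the subclass may only reside at Y or Z, the sites acceptable to both owners.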

4.8  Relocation  of  classes 

SeFD assumes that classes are objects themselves (the view held by Smalltalk
[9]). Therefore, for a class to be relocated (temporarily or for emigration
or replication) the same restrictions apply that apply to object
relocation.

However, in addition to the ‘object’ restrictions, additional restrictions
apply:  Every  class  C  that  is  a  subclass  of  another  class  S  carries  with  it 
information  about  the  class  S.  Such  a  subclass  will  carry  with  it  the  site 
constraints of its superclass S, just like any instance of S will carry with
it  such  constraints.  These  constraints  will  be  checked  in  addition  to  the 
‘object’  restrictions  mentioned  earlier. 

4.9  Changes  in  the  federation 

It  has  been  mentioned  earlier  that  federations  change:  Sites  join  and  leave 
the  federation  continually.  Further,  a  site  may  change  its  view  of  the 
trustedness  of  other  sites;  in  this  case  its  trusted  site  sets  will  change. 

The  first  form  of  change  (sites  joining  and  leaving)  can  be  a  problem 
if  bit  strings  are  used  to  represent  trusted  site  sets.  In  this  case  every  site 
has  an  inherent  number  associated:  the  position  that  it  occupies  in  the 
bit  strings.  New  sites  will  be  assigned  a  position  in  the  bit  string  as  part 
of the negotiation protocol to join the federation (negotiation protocols
are used extensively to coordinate activities in a federated database; see
[24, pp221-222] for an introduction). All members will (eventually) be
informed  of  the  existence  of  the  new  member.  However,  until  a  site  is 
informed  of  the  existence  of  the  new  member,  it  will  consider  the  new 
member  untrusted  for  all  sensitivity  levels  and  not  send  any  messages  to 
it.  When  a  site  is  informed  of  the  existence  of  the  new  site,  the  local  system 
security  officer  is  informed  accordingly,  after  which  the  trustedness  of  this 
site  is  determined  and  the  trusted  site  sets  modified.  Objects  that  contain 
copies  of  the  trusted  site  sets  also  have  to  be  updated.  This  will  be  dealt 
with  in  the  following  paragraph.  When  a  site  leaves  the  federation,  the 
trusted  site  sets  will  similarly  have  to  be  updated.  Note  that  an  adequate 
time  has  to  elapse  before  the  corresponding  bit  position  is  used  again, 
because  all  sites  must  be  able  to  remove  the  site  that  has  left  before  the 
bit  position  can  be  reused. 
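
The bookkeeping for bit positions might look like the sketch below. It is an assumption-laden illustration: SeFD only requires that adequate time elapses before a position is reused, whereas this sketch models that delay by waiting for an explicit acknowledgement from every remaining member, and the class and method names are invented for the example.

```python
class PositionRegistry:
    """Assigns bit positions to sites and delays reuse of freed positions."""

    def __init__(self):
        self.position = {}   # site -> bit position
        self.pending = {}    # freed position -> sites that have not yet acknowledged

    def join(self, site):
        # Pick the lowest position that is neither in use nor awaiting release.
        used = set(self.position.values()) | set(self.pending)
        pos = 0
        while pos in used:
            pos += 1
        self.position[site] = pos
        return pos

    def leave(self, site):
        pos = self.position.pop(site)
        # A departed site no longer needs to acknowledge earlier departures.
        for freed in list(self.pending):
            self.acknowledge(site, freed)
        if self.position:
            self.pending[pos] = set(self.position)
        return pos

    def acknowledge(self, site, pos):
        # Called once `site` has removed the departed member from its site sets.
        waiting = self.pending.get(pos)
        if waiting is not None:
            waiting.discard(site)
            if not waiting:
                del self.pending[pos]
```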

With the second form of change (i.e. change of trusted site sets) the problem
arises because of trusted site sets that are part of objects, especially
objects that are not (currently) located at their owning site. This is relatively
easily solved by requiring each site to maintain a list of objects that
it owns, residing at other sites. Such objects will then contain a method to
update their stored trusted site sets that can be activated by the owning
site. The same method will be used to update the trusted site sets in the
event of a new site joining or leaving the federation. When trusted site
sets change, the owning site will also be able to identify its objects that
reside at sites that are no longer trusted and relocate those objects. Note
that a request from an owner to relocate the object to the owner has to be
accepted by any site, even if it is not normally willing to send information
at  the  concerned  sensitivity  level  to  the  owner  site. 

5  Conclusion 

Trusteeship  is  a  central  issue  in  a  federated  database;  often  members  of 
the  federation  handle  information  on  behalf  of  other  members.  Where  this 
occurs,  this  member  acts  as  a  trustee  for  the  information  owned  by  the 
other member.

SeFD uses a mechanism where such wishes are obtained from the concerned
object. The owner of the object can decide on the way an object
‘knows’ the wishes of its owner: by including such ‘wishes’ in the object
and/or instructing the object to obtain such ‘wishes’ from the owner. Objects
modify the message trusted site set to protect information released
by the object. The object is also able to deny a request to relocate it to
another site based on the constraints imposed by its owner.

Since  trust  plays  a  central  role  in  a  federated  database,  it  is  necessary 
to  limit  the  actions  of  unworthy  trustees.  SeFD  allows  this  to  be  done 
on a site by site basis; each site has the discretionary power to limit the
sensitivity of any information that may (eventually) flow from this site to
any other specific site in the federation.

The following questions were identified during the design of SeFD but
not addressed by the current paper:

• SeFD assumes that the sites of the federated database are homogeneous;
no provision was made for heterogeneous sites. As one example,
assume that the one site uses a fully ordered set of classification
labels while the other uses a partially ordered set. It is clear that the
one that uses the partially ordered set should accept data from the
other without too much of a problem. However, the converse does
not hold because there is no obvious, universally applicable way that
the partially ordered labels can be translated to fully ordered labels.

• It may be possible that a request cannot continue because a message
cannot be sent to a target object, because it resides on a site that
is (no longer) in the message trusted site set. In this case it may
be possible to relocate the object to a site that does still appear in
the message trusted site set and so allow processing to continue; this
option needs to be investigated.

• In SeFD different aspects of an object can be owned by different
sites. Typically structural information of an object is owned by the
owner of the object's class (and its superclasses), while the content
is owned by the site where the object was instantiated or emigrated
to. However, SeFD does not make provision for different aspects of
an object to be owned by different sites.

• SeFD assumes that the object is the basic entity that will be relocated
or replicated; no provision was made for fragmentation.

These points remain interesting research questions that will receive attention
in future.
Abstract

In  this  paper  we  present  an  authorization  model  for  the  protection  of  object  oriented 
databases.  The  model  supports  the  concepts  of  explicit/implicit,  positive/negative, 
and strong/weak authorization. The model is an evolution of the ORION authorization
model but differs from it in many respects. In particular, the semantics of
subject  groups  and  of  negative  authorizations  applied  to  sets  of  objects  is  different. 
As  a  consequence  also  the  implication  rules  for  the  derivation  of  authorizations  are 
different.  In  the  paper  we  present  our  authorization  model,  illustrate  the  implication 
rules supported by our model for the derivation of authorizations in object-oriented
systems,  and  define  overriding  rules  among  authorizations. 

1  Introduction 

Object-oriented  database  management  systems  (OODBMS)  represent  today  the  most 
important research and development direction in the area of database systems. Indeed,
in addition to OODBMS directly developed from the object-oriented paradigm,
a number of systems, like relational DBMS (RDBMS) and deductive DBMS, are currently
being extended with object-oriented capabilities. The most notable example is
SQL-3  [n],  the  new  standard,  currently  under  definition,  for  the  SQL  language. 

Most  attention  in  the  development  of  data  management  systems  with  object-oriented 
features  has,  however,  been  given  to  traditional  database  issues  such  as  data  modeling, 
query languages, query processing, and schema management. By contrast, less attention
has been given to the problem of access control. Therefore, most of those systems
still  lack  adequate  authorization  models.  The  reason  is  that  the  various  authorization 
models, developed for access controls in operating systems and in database management
systems, cannot be directly applied to OODBMS. Indeed, several concepts of
the  object-oriented  data  model,  such  as  inheritance,  versions,  and  composite  objects, 
introduce  new  protection  requirements  which  the  traditional  authorization  models  do 
not address. Another source of complexity is that the applications intended as targets
for OODBMS, like CSCW, office automation, and CAD, have additional protection
requirements  and  policies  that  need  adequate  support  from  authorization  mechanisms. 
Temporal authorizations represent an example of such a requirement [2].

Work in the area of authorization models for object-oriented databases is still in
a  preliminary  stage  [5].  Very  few  OODBMS,  namely  Orion  [13]  and  Iris  [1],  provide 
authorization  models  comparable  to  the  models  provided  by  current  Relational  DBMS. 
Other OODBMS either do not provide any authorization mechanism at all, or provide
very low level capabilities. The GemStone system, for example, only allows authorizations
to be associated with segments, where a segment is the storage unit for objects [4].

The  goal  of  the  work,  presented  in  this  paper,  is  to  define  a  new  comprehensive 
authorization  model  for  OODBMS,  characterized  by  a  formal  and  sound  basis.  Indeed, 
the complexity of authorization models for OODBMS requires formal foundations to
account for all aspects of these models and as a basis for a correct implementation.
Our  model  has  been  defined  as  an  evolution  of  the  Orion  authorization  model.  Our 
model, like the Orion model, provides the notions of explicit/implicit authorization,
positive/negative  authorization,  and  strong/weak  authorization.  However,  it  has  a 
number  of  important  differences  with  respect  to  the  Orion  model. 

First, we take the approach of increasing the authorization types of the model and
reducing  the  number  of  authorization  objects.  The  Orion  model  is  based  on  only  four 
authorization  types,  whereas  the  number  of  authorization  objects  is  quite  high.  The 
main  drawback  of  the  Orion  model  is  that  authorization  objects  are  introduced  that  do 
not  always  correspond  to  real  database  objects.  Moreover,  the  same  authorization  type 
applied  to  different  object  types  may  have  different  semantic  meanings,  thus  making 
the  semantics  hard  to  understand  in  some  cases.  The  model  presented  here  is  therefore 
more  natural  and  can  be  easily  mapped  onto  a  user  language.  Moreover,  our  model 
allows  a  more  detailed  modeling  of  implications  among  authorization  types. 

Second, the Orion model uses a notion of user role that does not clearly correspond
to either the notion of group or the notion of role, as currently supported by
RDBMS or defined in literature proposals. Our model is based on the notions of user
and group as currently supported in RDBMS. Moreover, we plan to add the notion of
user role in our model, with, however, a clear semantic difference with respect to the
notion of group¹.

A  third  difference  is  related  to  semantics  of  negative  authorizations.  Both  our  model 
and  the  Orion  model  support  negative  authorization  as  a  way  to  support  exceptions 
with  respect  to  authorizations  collectively  granted  to  a  set  of  subjects,  or  on  a  set  of 
objects. However, in the Orion model the semantics of negative authorizations on a
set of objects is not clearly defined. In particular, a negative authorization on a set of
objects does not denote a negative authorization on each object of the set. Therefore,
it is not possible to specify negative authorizations on a set of objects to be intended
as applicable to all objects of the set. For instance, the only way to give a user the
negative authorization for the read privilege on all instances of a class is to give the user
the negation to read the definition of the class. However, this negation is more than
what we would have liked to do (it does not allow the user to read the definition of the class)
and  it  does  not  admit  exceptions  [3]. 

Finally, other differences that will not be discussed in detail in the present paper
include a more sophisticated management for authorizations on versions and composite
objects, and capabilities for decentralized authorization administration [3].

¹ Note that recent versions of RDBMS support both the notions of group and role.

With respect to authorization models described in other papers (e.g. [9, 1, 8])
our model differs in several respects. First, the model defined by Dittrich et al. [9]
does  not  include  many  of  the  notions  that  the  Orion  model  and  our  model  include, 
namely  positive/negative  authorization,  and  strong/weak  authorization.  Indeed,  the 
model by Dittrich et al. only provides the notion of explicit/implicit authorizations;
implicit authorizations in this model are derived through deductive rules defined by
the users. No consistency criteria are, however, defined by Dittrich et al. for those
deductive rules. The model defined for the Iris system [1]² is based on the approach
of  considering  methods  as  the  authorization  access  types.  Therefore,  in  that  model 
users  are  authorized  to  invoke  methods  on  objects.  The  Iris  model,  however,  does 
not  have  a  formal  basis,  and  several  questions,  especially  concerning  authorization 
administration,  are  left  open.  Moreover,  the  Iris  model  does  not  provide  the  same 
flexibility as our model, since it does not provide all the various types of authorizations,
like  positive/negative  authorization  and  so  on.  The  model  defined  by  Bruggemann 
[8]  differs  with  respect  to  our  model  in  a  number  of  aspects.  First,  in  that  model 
no  complete  account  is  given  for  all  modeling  constructs  of  an  object-oriented  data 
model,  in  that  this  model  does  not  consider  composite  objects,  and  versions.  Moreover, 
no  specification  is  provided  of  the  access  types  supported  in  the  model,  whereas  our 
model  provides  a  complete  account  under  this  aspect.  Finally,  that  model  handles 
exceptions  based  on  an  explicit  ordering  given  by  the  users.  However,  it  is  not  clear  how 
this  approach  would  work  in  a  decentralized  authorization  environment.  By  contrast, 
our  model  supports  multiple-level  exceptions  based  on  the  hierarchical  composition 
of  authorization  objects,  and  of  user  groups.  Therefore,  our  model  does  not  require 
explicit  user-defined  priorities  among  exceptions. 

The  remainder  of  this  paper  is  organized  as  follows.  Section  2  describes  the  reference 
object-oriented data model which will be used throughout the paper. Section 3 illustrates
our  authorization  model.  Section  4  presents  the  implication  rules  for  the  derivation  of 
authorizations.  Section  5  defines  the  authorization  state.  Section  6  illustrates  how  the 
access  control  works.  Finally,  Section  7  presents  the  conclusions. 


2  A  reference  object-oriented  data  model 

In  this  section,  we  summarize  the  main  features  of  object-oriented  data  models  by  a 
reference  model  which  will  be  used  in  the  rest  of  the  paper  for  the  discussion. 

Each  real-world  entity  is  modeled  as  an  object.  A  set  of  attributes  is  associated  to 
each  object.  An  attribute  of  an  object  may  take  on  a  single  value  or  a  set  of  values. 

Each  object  is  associated  with  a  unique  identifier  (OID)  which  is  fixed  for  the  whole 
life  of  the  object.  The  identity  of  an  object  has  an  existence  independent  of  the  values 
of  the  object  attributes. 

All  objects  which  share  the  same  set  of  attributes  are  grouped  together  in  a  higher 
level  object  called  a  class.  Each  object  belongs  to  (is  an  instance  of)  only  one  class.  A 
primitive  class  is  a  class  with  no  attributes  (e.g.,  integer,  string,  or  boolean).  The  value 
of  an  attribute  of  an  object  belongs  to  some  class.  This  class  is  called  the  domain  of 
the  attribute.  The  domain  of  an  attribute  may  be  any  class,  including  a  primitive  class. 

If an attribute of a class C has a class C' as a domain, an aggregation relationship is
established between the classes C and C'. According to this relationship, the set of
classes in the schema is organized in an aggregation hierarchy.

(Footnote) Note that this model has not actually been implemented in the Iris prototype. Therefore, the model
described in the paper is still at a preliminary stage.

Users  can  derive  a  new  class  from  an  existing  class.  The  new  class,  called  a  subclass 
of  the  existing  class,  inherits  all  the  attributes  of  the  existing  class,  called  the  superclass 
of  the  new  class.  The  instances  of  the  subclass  are  members  of  all  its  superclasses.  Users 
may specify additional attributes for the subclass. A class may have more than one
superclass (multiple inheritance). According to the subclass/superclass relationship, the
classes form a rooted directed acyclic graph, called the inheritance hierarchy.

Objects are accessed by system-defined methods that allow users to read and write object
attributes, read and modify the class definitions, create and delete instances and classes,
and so on.


3  The  authorization  model 

In  this  section  we  illustrate  our  authorization  model. 

3.1  Elements  of  the  model 

3.1.1  Subjects 

Authorization subjects can be either users or groups. In the following, $S$, $U$, and
$GS$ denote the sets of subjects, users, and groups respectively, where $S = U \cup GS$. A
group is defined as a set of other subjects (users or groups). Groups are not necessarily
disjoint, i.e., the same subject may belong to several groups. The membership relationship
between subjects is represented by a graph, called the subject graph, as follows. Each
subject is represented by a node. An arc directed from subject $s_i$ (group) to subject
$s_j$ (user or group) indicates that subject $s_j$ directly belongs to group $s_i$. An example
of a subject graph is illustrated in Figure 1.

Given two subjects $s_j \in S$, $s_i \in GS$, notation $s_j <_1 s_i$ (or simply, $s_j < s_i$) indicates
that there exists an arc from $s_i$ to $s_j$ in the subject graph, i.e., $s_j$ directly belongs to
group $s_i$. Notation $s_j <_n s_i$ indicates that there exist $s_{k_0} \in S$, $s_{k_1}, \ldots, s_{k_n} \in GS$ such
that $s_{k_0} = s_j$, $s_{k_n} = s_i$ and $s_{k_0} < s_{k_1} < \cdots < s_{k_n}$. Notation $s_j <_0 s_i$ indicates equality,
i.e., is equivalent to $s_j = s_i$.

If $s_j <_n s_i$ with $n > 1$ we say that $s_j$ indirectly belongs to group $s_i$. A subject $s_j$
may belong to a group $s_i$ through several paths in the subject graph. In other words,
the relation $s_j <_n s_i$ may be valid for different values of $n$. For instance, with reference
to the subject graph illustrated in Figure 1, Bob $<$ G4, Bob $<$ G2, Bob $<_2$ G2, Bob
$<_2$ G1, Bob $<_3$ G1.
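The membership relation $<_n$ can be illustrated with a minimal sketch. Since Figure 1 is not reproduced here, the arcs below are an assumption chosen only to agree with the example just given; the name `membership_lengths` is hypothetical, and the graph is assumed to be acyclic.

```python
# Hypothetical subject graph: an arc group -> member means that the member
# directly belongs to the group. The arcs are an assumption consistent with
# the example in the text (Figure 1 itself is not available here).
MEMBERS = {
    "G1": ["G2"],
    "G2": ["G4", "Bob"],
    "G4": ["Bob"],
}


def membership_lengths(member, group, members=MEMBERS):
    """Return the set of n >= 0 such that member <_n group.

    Assumes the subject graph is acyclic, so the walk terminates.
    """
    lengths = set()
    stack = [(group, 0)]  # (current subject, path length from the group)
    while stack:
        node, depth = stack.pop()
        if node == member:
            lengths.add(depth)
        for child in members.get(node, []):
            stack.append((child, depth + 1))
    return lengths
```

Under these assumed arcs, `membership_lengths("Bob", "G1")` yields the two path lengths 2 and 3, matching the statement that the relation may hold for different values of $n$.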

3.1.2  Objects 

The set of authorization objects, denoted by $O$, is composed of databases, classes of the
databases, and instances of the classes, i.e., $O = \mathit{Database} \cup \mathit{Class} \cup \mathit{Instance}$. We do not
consider  attributes  of  instances  as  objects  of  authorizations.  However,  authorizations 
for  specific  attributes  can  be  specified  on  instances. 

Objects  are  organized  into  an  object  granularity  hierarchy:  a  class  is  composed  of 
a  set  of  instances,  a  database  is  composed  of  a  set  of  classes.  The  system  in  turn 
consists  of  a  set  of  databases.  An  example  of  object  granularity  hierarchy  is  illustrated 
in Figure 2.

Figure 1: An example of subject graph


In the following, given two objects $o_i, o_j \in O$, notation $o_j <_1 o_i$ (or simply, $o_j < o_i$)
indicates that $o_j$ is a direct descendant of $o_i$ in the object granularity hierarchy, i.e., $o_j$
directly belongs to the set of objects represented by $o_i$. Notation $o_j <_n o_i$ indicates that
there exist $o_{k_0}, o_{k_1}, \ldots, o_{k_n} \in O$ such that $o_{k_0} = o_j$, $o_{k_n} = o_i$ and $o_{k_0} < o_{k_1} < \cdots < o_{k_n}$.
Notation $o_j <_0 o_i$ indicates equality, i.e., is equivalent to $o_j = o_i$.

For  instance,  with  reference  to  the  object  granularity  hierarchy  of  Figure  2,  Empl 
<  Employees  <  Administration. 

Classes are organized into inheritance hierarchies on the basis of the subclass/superclass
relationship. In the following, given two classes $o, o'$, notation $o' \prec_1 o$ (or simply,
$o' \prec o$) indicates that $o'$ is a direct subclass of $o$. In multiple inheritance, given
$o_1, o_2, \ldots, o_n \in \mathit{Class}$, notation $o' \prec \{o_1, o_2, \ldots, o_n\}$ indicates that $o'$ is a direct subclass
of $o_1, o_2, \ldots, o_n$. Notation $o' \prec_n o$ indicates that there exist $o_{k_0}, o_{k_1}, \ldots, o_{k_n} \in \mathit{Class}$
such that $o_{k_0} = o'$, $o_{k_n} = o$ and $o_{k_0} \prec o_{k_1} \prec \cdots \prec o_{k_n}$. Again, notation $o' \prec_0 o$ indicates
identity, i.e., is equivalent to $o' = o$.

3.1.3  Access  modes 

The  set  of  access  modes,  denoted  by  M,  consists  of  privileges  that  users  can  exercise 
on  the  objects.  The  access  modes  applicable  to  an  object  depend  on  the  type  of  the 
object,  e.g.,  database,  class,  or  instance.  In  our  model,  the  following  access  modes  are 
considered: 

1. M(database) = Access modes applicable to databases:

(a) read_def: to read the definition of the database;
(b) read: to read all objects within the database;
(c) write: to modify all objects within the database;
(d) create: to create a new class within the database.

2. M(class) = Access modes applicable to classes:

(a) read_def: to read the definition of the class;
(b) write_def: to modify the definition of the class;
(c) delete_def: to drop the class;
(d) read: to read all instances of the class;
(e) write: to modify all instances of the class;
(f) read($A_k$): to read attribute $A_k$ for all instances of the class;
(g) write($A_k$): to modify attribute $A_k$ for all instances of the class;
(h) create: to create new instances of the class;
(i) delete: to delete all instances of the class.

3. M(instance) = Access modes applicable to instances:

(a) read: to read all attributes of the instance;
(b) write: to modify all attributes of the instance;
(c) read($A_k$): to read attribute $A_k$ of the instance;
(d) write($A_k$): to modify attribute $A_k$ of the instance;
(e) delete: to delete the instance.

The read_def access mode on a class allows a subject to read the definition of the class. The
read_def access mode on a database allows a subject to read the directory of the database, i.e.,
to see the names of all classes within the database and the inheritance hierarchies
they form. Note, however, that the read_def access mode on a database does not allow users
to read the definition of the classes contained in the database.

The delete_def access mode, which allows a subject to drop a class, can be exercised only
when the class has no instances.

The  write  privilege  on  a  database  is  very  powerful  since  it  allows  every  access  to 
the  database  and  the  objects  contained  in  it.  Therefore,  it  is  intended  for  the  use  of 
the  database  administrator. 

The privilege to read an attribute of an object whose value is the OID of another
object allows a subject to read the object identifier of the referenced object but, in general,
not the contents of the referenced object.

In order to ensure that an access mode is properly applied, we define a function
$t : O \rightarrow \{\mathit{database}, \mathit{class}, \mathit{instance}\}$ which associates with each object $o \in O$ its
type $t(o)$. An access mode $m \in M$ is applicable to an object $o \in O$ if and only if
$m \in M(t(o))$.
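The applicability check can be sketched as a table lookup. The following is only an illustration under stated assumptions: the dictionary `t` stands in for the function $t$, attribute-level modes read($A_k$) and write($A_k$) are encoded as hypothetical tuples such as `("read", "salary")`, and membership of the attribute in the object's attribute set is not checked.

```python
# Access modes per object type, transcribed from the list in Section 3.1.3.
M = {
    "database": {"read_def", "read", "write", "create"},
    "class": {"read_def", "write_def", "delete_def",
              "read", "write", "create", "delete"},
    "instance": {"read", "write", "delete"},
}
# Object types on which attribute-level modes read(A_k)/write(A_k) exist.
ATTRIBUTE_MODES = {"class": {"read", "write"}, "instance": {"read", "write"}}


def is_applicable(mode, obj, t):
    """Return True iff mode belongs to M(t(obj)).

    t is a dict mapping each object name to "database", "class" or
    "instance"; a tuple mode such as ("read", "salary") is an
    attribute-level mode (an encoding assumed for this sketch).
    """
    obj_type = t[obj]
    if isinstance(mode, tuple):
        name, _attribute = mode
        return name in ATTRIBUTE_MODES.get(obj_type, set())
    return mode in M[obj_type]
```

For example, with `t = {"Administration": "database", "Employees": "class"}`, the mode `create` is applicable to both objects, whereas `write_def` is applicable only to `Employees`.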


3.2  Authorizations 

Our  model  provides  both  positive  and  negative  authorizations.  A  positive  authorization 
is  used  to  specify  that  a  subject  may  exercise  an  access  mode  on  an  object.  A  negative 
authorization  is  used  to  specify  that  a  subject  is  denied  an  access  mode  on  an  object. 

The possibility of specifying both positive and negative authorizations may introduce
conflicts due to the simultaneous presence of two authorizations that differ only in
the sign. In some cases, conflicts may be easily resolved by specifying overriding rules
among authorizations. Our model gives the grantor of an authorization the possibility
of specifying whether the authorization may be overridden by other authorizations. To
support overriding, we distinguish between strong and weak authorizations. A strong
authorization cannot be overridden, i.e., it does not admit exceptions. A weak
authorization may be overridden, according to specified rules, by other strong or weak
authorizations.

To formally represent authorizations, we introduce the following definitions.

Definition 1 (Authorization Space) The authorization space $ASP$ is defined as:
$ASP = S \times O \times M \times \{+, -\} \times \{st, wk\}$
where $+$ indicates positive, $-$ indicates negative, $st$ indicates strong, and $wk$ indicates
weak.

Definition 2 (Authorization) An authorization $a \in ASP$ is a 5-tuple $(s, o, m, as, at)$
where:

$s \in S$ is the subject to whom the authorization is granted;
$o \in O$ is the object to which the authorization is referred;
$m \in M(t(o))$ is the access mode;
$as \in \{+, -\}$ indicates whether the authorization is referred to the access mode ($+$) or
its negation ($-$);
$at \in \{st, wk\}$ indicates whether the authorization is strong ($st$) or weak ($wk$).


For instance, authorization $(s, o, m, +, st)$ states that subject $s$ can exercise access
mode $m$ on object $o$, and this authorization cannot have exceptions. Authorization
$(s, o, m, -, wk)$ states that subject $s$ cannot exercise access mode $m$ on object $o$, and
this authorization can have exceptions.

Given an authorization $a$, notation $s(a)$, $o(a)$, $m(a)$, $as(a)$, $at(a)$ denotes the subject,
the object, the access mode, the sign (positive or negative), and the type (strong
or weak) of authorization $a$, respectively.
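As an illustration, the 5-tuple of Definition 2 can be represented as an immutable record. This is a minimal sketch, not part of the model: the class name `Authorization`, the field name `sign` (used because `as` is a reserved word in Python), and the helper `differ_only_in_sign` are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Authorization:
    s: str     # subject (user or group)
    o: str     # object (database, class, or instance)
    m: str     # access mode, assumed to be applicable to t(o)
    sign: str  # "+" or "-"; plays the role of as(a)
    at: str    # "st" (strong) or "wk" (weak)

    def __post_init__(self):
        # Reject tuples that fall outside the authorization space ASP.
        if self.sign not in ("+", "-"):
            raise ValueError("sign must be '+' or '-'")
        if self.at not in ("st", "wk"):
            raise ValueError("type must be 'st' or 'wk'")


def differ_only_in_sign(a, b):
    """True when a and b agree on subject, object, mode and type but not on
    sign, i.e., the potentially conflicting pair mentioned in Section 3.2."""
    return (a.s, a.o, a.m, a.at) == (b.s, b.o, b.m, b.at) and a.sign != b.sign
```

Making the record immutable and hashable allows authorizations to be stored in ordinary sets, which is convenient for representing authorization bases.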

Authorizations  specified  by  the  users  are  called  explicit.  These  authorizations  are 
grouped  into  a  strong  and  a  weak  authorization  base  as  follows. 

Definition 3 (Strong authorization base) A strong authorization base $SAB \subseteq ASP$
is a set of explicit authorizations with $at(a) = st$.

Definition 4 (Weak authorization base) A weak authorization base $WAB \subseteq ASP$
is a set of explicit authorizations with $at(a) = wk$.

The authorizations specified by the users are seen as a generating set of authorizations.
Starting from these authorizations, other authorizations can be derived through
the relationships among subjects, objects, and access modes. Derivation rules are given
in Section 4.

We introduce an implication relationship between authorizations defined as follows.
Given two authorizations $a, a' \in ASP$, $a$ implies $a'$, denoted by $a \rightarrow a'$, if $a'$ is derived
from authorization $a$ through one of the implication rules of the model. Notation
$a \rightarrow_n a'$ indicates that there exist $a_0, a_1, \ldots, a_n \in ASP$ such that $a_0 = a$, $a_n = a'$ and
$a_0 \rightarrow a_1 \rightarrow \cdots \rightarrow a_n$. If $n = 0$ the relationship indicates equality, i.e., writing $a \rightarrow_0 a'$
is the same as writing $a = a'$.

The implication relationship preserves the authorization sign and the authorization
type, i.e., given $a, a' \in ASP$, if $a \rightarrow a'$ then $as(a') = as(a)$, $at(a') = at(a)$.

Authorizations  derived  by  applying  the  implication  rules  are  called  implicit,  as 
stated  by  the  following  definition. 

Definition 5 (Implicit authorization) An implicit authorization is an authorization
$a \in ASP$ such that $a \notin SAB \cup WAB$ and $\exists a' \in SAB \cup WAB$ such that
$a' \rightarrow_n a$, $n \geq 1$.

Note  that  an  authorization  explicitly  contained  in  an  authorization  base  may  also 
be  derivable  from  other  authorizations  through  the  implication  relationship.  According 
to  the  previous  definition  we  consider  this  authorization  as  explicit.  In  other  words, 
we  do  not  require  the  explicit  set  of  authorizations  to  be  minimal. 


4  Implication  rules 

In this section we illustrate the implication rules of our model. We distinguish among
the different domains along which we derive implicit authorizations. We analyze
implication rules for subjects, for access modes, for objects, and along the class inheritance
hierarchy.

Implication rules can be illustrated with a graph in which the nodes are the access
modes. The set of nodes is partitioned into three disjoint subsets according to the
type of object to which the access modes are referred (i.e., databases, classes, and
instances). The arcs represent the implication rules of our model. The black-colored
arcs represent implications between positive authorizations, while the grey-colored arcs
represent implications between negative authorizations. Each arc is labeled with the
number of the corresponding implication rule.

4.1  Implication  rules  for  subjects 

The first implication rule refers to the propagation of positive and negative authorizations
along the subject graph. Considering groups of subjects makes it possible to grant
privileges to sets of users in an efficient way.

Rule 1 The authorization (negation) of a group for a privilege on an object implies the
authorization (negation) for the privilege on the object for the direct members of the
group. Formally, $\forall s_i, s_j \in S, \forall o \in O, \forall m \in M(t(o)), \forall as \in \{+, -\}, \forall at \in \{st, wk\}, s_j < s_i :$
$(s_i, o, m, as, at) \rightarrow (s_j, o, m, as, at)$.

Rule 1 states that an authorization $a$ for a subject (group) $s$ propagates to all the
subjects which directly belong to $s$. By applying this rule recursively we have that the
authorization of a group propagates to all its members, both direct and indirect.

For instance, with reference to the subject graph illustrated in Figure 1, authorization
(G1, Employees, read_def, +, wk) implies authorization (G2, Employees, read_def, +, wk)
which in turn implies authorization (Bob, Employees, read_def, +, wk).
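The recursive application of Rule 1 can be sketched as a traversal of the subject graph. The arcs below are an assumption consistent with the example (Figure 1 is not reproduced here), authorizations are plain tuples (subject, object, mode, sign, type), and the function name `propagate_to_members` is hypothetical.

```python
# Hypothetical subject graph (group -> direct members), assumed acyclic.
MEMBERS = {"G1": ["G2"], "G2": ["G4", "Bob"], "G4": ["Bob"]}


def propagate_to_members(auth, members=MEMBERS):
    """Apply Rule 1 repeatedly: return all authorizations implied for the
    direct and indirect members of the subject of auth."""
    s, o, m, sign, at = auth
    implied = set()
    pending = list(members.get(s, []))
    while pending:
        member = pending.pop()
        derived = (member, o, m, sign, at)  # sign and type are preserved
        if derived not in implied:
            implied.add(derived)
            pending.extend(members.get(member, []))
    return implied
```

Starting from `("G1", "Employees", "read_def", "+", "wk")`, the sketch derives the same authorization for G2, G4, and Bob under the assumed arcs.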

4.2  Implication  rules  for  access  modes 

Implication  rules  for  access  modes  are  based  on  relationships  among  access  modes 
which  belong  to  the  same  subset  of  access  modes  (i.e.,  referred  to  the  same  type  of 
object).  Given  an  authorization  (negation)  for  an  access  mode  on  an  object,  these 
rules  allow  an  authorization  (negation)  to  be  derived  for  another  access  mode  on  the 
same  object.  Negative  authorizations  propagate,  with  respect  to  the  corresponding 
positive  authorizations,  either  in  the  same  way  or  in  the  opposite  way  (logic  negation). 

In the following, notation $\mathit{Setof\_attr}(o)$ denotes the set of attributes of object $o$.
The rationale for the implications expressed by the rules is mostly self-evident;
we include some explanation when needed. The graphical representation of these
rules is given in Figure 3.

Rule 2 The privilege to modify an object implies the privilege to read the object.
Formally, $\forall s \in S, \forall o \in O, \forall at \in \{st, wk\} : (s, o, \mathit{write}, +, at) \rightarrow (s, o, \mathit{read}, +, at)$.

Rule 3 The negation of the privilege to read an object implies the negation of the
privilege to modify the object. Formally, $\forall s \in S, \forall o \in O, \forall at \in \{st, wk\} :$
$(s, o, \mathit{read}, -, at) \rightarrow (s, o, \mathit{write}, -, at)$.

Rule 4 The create privilege on a database (class) implies the privilege to read the
definition of the database (class). Formally, $\forall s \in S, \forall o \in \mathit{Database} \cup \mathit{Class}, \forall at \in \{st, wk\} :$
$(s, o, \mathit{create}, +, at) \rightarrow (s, o, \mathit{read\_def}, +, at)$.


Figure  3:  Graphical  representation  of  the  implication  rules  for  access  modes 

Rule  4  states  that  if  a  subject  is  authorized  to  create  a  class  within  a  database,  then 
the  subject  is  implicitly  authorized  to  read  the  directory  of  the  database.  Likewise,  if 
a  subject  is  authorized  to  create  an  instance  of  a  class,  then  the  subject  is  implicitly 
authorized  to  read  the  definition  of  the  class. 

Rule 5 The negation of the privilege to read the definition of a database (class) implies
the negation of the create privilege on the database (class). Formally, $\forall s \in S, \forall o \in \mathit{Database} \cup \mathit{Class}, \forall at \in \{st, wk\} :$
$(s, o, \mathit{read\_def}, -, at) \rightarrow (s, o, \mathit{create}, -, at)$.

Rule  5  states  that  if  a  subject  is  denied  to  read  the  directory  of  a  database,  then 
the  subject  is  implicitly  denied  to  create  a  class  within  the  database.  Likewise,  if  a 
subject  is  denied  to  read  the  definition  of  a  class,  then  the  subject  is  implicitly  denied 
to  create  an  instance  of  the  class. 

Rule 6 The read privilege on a database (class) implies the privilege to read the
definition of the database (class). Formally, $\forall s \in S, \forall o \in \mathit{Database} \cup \mathit{Class}, \forall at \in \{st, wk\} :$
$(s, o, \mathit{read}, +, at) \rightarrow (s, o, \mathit{read\_def}, +, at)$.

According to Rule 6, the privileges to read all objects within a database and to
read all instances of a class imply, respectively, the privileges to read the directory of
the database and to read the definition of the class.

Rule 7 The negation of the privilege to read the definition of a database (class) implies
the negation of the read privilege on the database (class). Formally, $\forall s \in S, \forall o \in \mathit{Database} \cup \mathit{Class}, \forall at \in \{st, wk\} :$
$(s, o, \mathit{read\_def}, -, at) \rightarrow (s, o, \mathit{read}, -, at)$.

According  to  Rule  7,  the  negation  of  the  privileges  to  read  the  directory  of  a 
database  and  to  read  the  definition  of  a  class  implies,  respectively,  the  negation  of 
the  privileges  to  read  all  objects  within  the  database  and  to  read  all  instances  of  the 
class. 

Rule 8 The privilege to modify the definition of a class implies the privilege to
read the definition of the class. Formally, $\forall s \in S, \forall o \in \mathit{Class}, \forall at \in \{st, wk\} :$
$(s, o, \mathit{write\_def}, +, at) \rightarrow (s, o, \mathit{read\_def}, +, at)$.

Rule 9 The negation of the privilege to read the definition of a class implies the
negation of the privilege to modify the definition of the class. Formally, $\forall s \in S, \forall o \in \mathit{Class}, \forall at \in \{st, wk\} :$
$(s, o, \mathit{read\_def}, -, at) \rightarrow (s, o, \mathit{write\_def}, -, at)$.

Rule 10 The privilege to delete the definition of a class implies the privilege to
read the definition of the class. Formally, $\forall s \in S, \forall o \in \mathit{Class}, \forall at \in \{st, wk\} :$
$(s, o, \mathit{delete\_def}, +, at) \rightarrow (s, o, \mathit{read\_def}, +, at)$.

Rule 11 The negation of the privilege to read the definition of a class implies the
negation of the privilege to delete the definition of the class. Formally, $\forall s \in S, \forall o \in \mathit{Class}, \forall at \in \{st, wk\} :$
$(s, o, \mathit{read\_def}, -, at) \rightarrow (s, o, \mathit{delete\_def}, -, at)$.

Rule 12 The privilege to modify an attribute on a class (instance) implies the
privilege to read the attribute on the class (instance). Formally, $\forall s \in S, \forall o \in \mathit{Class} \cup \mathit{Instance}, \forall A_k \in \mathit{Setof\_attr}(o), \forall at \in \{st, wk\} :$
$(s, o, \mathit{write}(A_k), +, at) \rightarrow (s, o, \mathit{read}(A_k), +, at)$.

Rule  12  states  that  the  privileges  to  modify  an  attribute  of  an  instance  and  to 
modify  an  attribute  for  all  instances  of  a  class  imply,  respectively,  the  privileges  to 
read  the  attribute  of  the  instance  and  to  read  the  attribute  for  all  instances  of  the 
class. 

Rule 13 The negation of the privilege to read an attribute on a class (instance) implies
the negation of the privilege to modify the attribute on the class (instance). Formally,
$\forall s \in S, \forall o \in \mathit{Class} \cup \mathit{Instance}, \forall A_k \in \mathit{Setof\_attr}(o), \forall at \in \{st, wk\} :$
$(s, o, \mathit{read}(A_k), -, at) \rightarrow (s, o, \mathit{write}(A_k), -, at)$.

Rule  13  states  that  the  negation  of  the  privileges  to  read  an  attribute  of  an  instance 
and  to  read  an  attribute  for  all  instances  of  a  class  implies,  respectively,  the  negation 
of  the  privileges  to  modify  the  attribute  of  the  instance  and  to  modify  the  attribute 
for all instances of the class.

Rule 14 The write privilege on a class (instance) implies the write privilege for
every attribute on the class (instance). The negation of the write privilege on an
object propagates in the same way. Formally, $\forall s \in S, \forall o \in \mathit{Class} \cup \mathit{Instance}, \forall A_k \in \mathit{Setof\_attr}(o), \forall as \in \{+, -\}, \forall at \in \{st, wk\} :$
$(s, o, \mathit{write}, as, at) \rightarrow (s, o, \mathit{write}(A_k), as, at)$.

Rule 15 The read privilege on a class (instance) implies the read privilege for
every attribute on the class (instance). The negation of the read privilege on an
object propagates in the same way. Formally, $\forall s \in S, \forall o \in \mathit{Class} \cup \mathit{Instance}, \forall A_k \in \mathit{Setof\_attr}(o), \forall as \in \{+, -\}, \forall at \in \{st, wk\} :$
$(s, o, \mathit{read}, as, at) \rightarrow (s, o, \mathit{read}(A_k), as, at)$.

Rule 16 The delete privilege on a class (instance) implies the read privilege on the
class (instance). Formally, $\forall s \in S, \forall o \in \mathit{Class} \cup \mathit{Instance}, \forall at \in \{st, wk\} :$
$(s, o, \mathit{delete}, +, at) \rightarrow (s, o, \mathit{read}, +, at)$.

Rule 17 The negation of the privilege to read an attribute on a class (instance)
implies the negation of the delete privilege on the class (instance). Formally, $\forall s \in S, \forall o \in \mathit{Class} \cup \mathit{Instance}, \forall A_k \in \mathit{Setof\_attr}(o), \forall at \in \{st, wk\} :$
$(s, o, \mathit{read}(A_k), -, at) \rightarrow (s, o, \mathit{delete}, -, at)$.
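Rules 2 to 17 all relate access modes on the same object, so their combined effect can be sketched as a closure computation. The following is an illustrative transcription under stated assumptions: attribute-level modes are encoded as hypothetical tuples such as `("read", "salary")`, the attribute set of the object is passed in as `attrs`, and the function names are introduced here.

```python
def one_step(obj_type, mode, sign, attrs):
    """Modes directly implied on the same object by Rules 2-17.
    The derived authorization keeps the sign of the source."""
    out = set()
    is_attr = isinstance(mode, tuple)
    db_or_class = obj_type in ("database", "class")
    class_or_inst = obj_type in ("class", "instance")
    if sign == "+":
        if mode == "write":
            out.add("read")                                  # Rule 2
        if db_or_class and mode in ("create", "read"):
            out.add("read_def")                              # Rules 4, 6
        if obj_type == "class" and mode in ("write_def", "delete_def"):
            out.add("read_def")                              # Rules 8, 10
        if class_or_inst and is_attr and mode[0] == "write":
            out.add(("read", mode[1]))                       # Rule 12
        if class_or_inst and mode == "delete":
            out.add("read")                                  # Rule 16
    else:
        if mode == "read":
            out.add("write")                                 # Rule 3
        if db_or_class and mode == "read_def":
            out.update({"create", "read"})                   # Rules 5, 7
        if obj_type == "class" and mode == "read_def":
            out.update({"write_def", "delete_def"})          # Rules 9, 11
        if class_or_inst and is_attr and mode[0] == "read":
            out.update({("write", mode[1]), "delete"})       # Rules 13, 17
    if class_or_inst and mode in ("read", "write"):
        out.update((mode, a) for a in attrs)                 # Rules 14, 15
    return out


def implied_modes(obj_type, mode, sign, attrs=()):
    """All modes implied on the same object by repeated rule application."""
    seen, pending = set(), [mode]
    while pending:
        for nxt in one_step(obj_type, pending.pop(), sign, attrs):
            if nxt not in seen:
                seen.add(nxt)
                pending.append(nxt)
    return seen
```

For instance, a positive write authorization on a class with a single attribute `salary` yields read, read_def, write(salary) and read(salary) on that class, while a negative read_def on the same class denies every other class-level mode.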

4.3  Implication  rules  for  objects 

The implication rules for objects are based on the hierarchical structure of the set of
objects (see, for example, Figure 2). Given an authorization for a privilege on an object
$o$, these rules make it possible to derive an authorization for the same or a different
privilege on objects contained in $o$.

In the following we state the implication rules for objects. The graphical
representation of these rules is given in Figure 4. The arcs of this graph represent the
relationships among access modes applied to objects related by $<$.

Rule 18 The privilege to read a database (class) implies the privilege to read all the
classes (instances) contained in it. The negation of the read privilege propagates in
the same way. Formally, $\forall s \in S, \forall o_i \in \mathit{Database} \cup \mathit{Class}, \forall o_j \in \mathit{Class} \cup \mathit{Instance}, \forall as \in \{+, -\}, \forall at \in \{st, wk\}, o_j < o_i :$
$(s, o_i, \mathit{read}, as, at) \rightarrow (s, o_j, \mathit{read}, as, at)$.

Rule 19 The negation of the privilege to read the definition of a database implies the
negation of the privilege to read the definition of every class of the database. Formally,
$\forall s \in S, \forall o_i \in \mathit{Database}, \forall o_j \in \mathit{Class}, \forall at \in \{st, wk\}, o_j < o_i :$
$(s, o_i, \mathit{read\_def}, -, at) \rightarrow (s, o_j, \mathit{read\_def}, -, at)$.

Rule 20 The privilege to modify a database (class) implies the privilege to modify all
the classes (instances) contained in it. The negation of the modify privilege propagates
in the same way. Formally, $\forall s \in S, \forall o_i \in \mathit{Database} \cup \mathit{Class}, \forall o_j \in \mathit{Class} \cup \mathit{Instance}, \forall as \in \{+, -\}, \forall at \in \{st, wk\}, o_j < o_i :$
$(s, o_i, \mathit{write}, as, at) \rightarrow (s, o_j, \mathit{write}, as, at)$.

Rule 21 The write privilege on a database implies the delete privilege on each class
of the database. The negation of the write privilege propagates in the same way.
Formally, $\forall s \in S, \forall o_i \in \mathit{Database}, \forall o_j \in \mathit{Class}, \forall as \in \{+, -\}, \forall at \in \{st, wk\}, o_j < o_i :$
$(s, o_i, \mathit{write}, as, at) \rightarrow (s, o_j, \mathit{delete}, as, at)$.


Figure  4:  Graphical  representation  of  the  implication  rules  for  objects 

Rule 22 The write privilege on a database implies the privilege to modify the
definition of each class of the database. The negation of the write privilege propagates
in the same way. Formally, $\forall s \in S, \forall o_i \in \mathit{Database}, \forall o_j \in \mathit{Class}, \forall as \in \{+, -\}, \forall at \in \{st, wk\}, o_j < o_i :$
$(s, o_i, \mathit{write}, as, at) \rightarrow (s, o_j, \mathit{write\_def}, as, at)$.

Rule 23 The write privilege on a database implies the privilege to delete the
definition of every class of the database. The negation of the write privilege propagates in
the same way. Formally, $\forall s \in S, \forall o_i \in \mathit{Database}, \forall o_j \in \mathit{Class}, \forall as \in \{+, -\}, \forall at \in \{st, wk\}, o_j < o_i :$
$(s, o_i, \mathit{write}, as, at) \rightarrow (s, o_j, \mathit{delete\_def}, as, at)$.

Rule 24 The write privilege on a database implies the create privilege on every class
of the database. The negation of the write privilege propagates in the same way.
Formally, $\forall s \in S, \forall o_i \in \mathit{Database}, \forall o_j \in \mathit{Class}, \forall as \in \{+, -\}, \forall at \in \{st, wk\}, o_j < o_i :$
$(s, o_i, \mathit{write}, as, at) \rightarrow (s, o_j, \mathit{create}, as, at)$.

Rule 25 The privilege to read an attribute on a class implies the privilege to read
the attribute on every instance of the class. The negation of the privilege to read an
attribute on a class propagates in the same way. Formally, $\forall s \in S, \forall o_i \in \mathit{Class}, \forall A_k \in \mathit{Setof\_attr}(o_i), \forall o_j \in \mathit{Instance}, \forall as \in \{+, -\}, \forall at \in \{st, wk\}, o_j < o_i :$
$(s, o_i, \mathit{read}(A_k), as, at) \rightarrow (s, o_j, \mathit{read}(A_k), as, at)$.

Rule 26 The privilege to modify an attribute on a class implies the privilege to
modify the attribute on every instance of the class. The negation of the privilege to
modify an attribute on a class propagates in the same way. Formally, $\forall s \in S, \forall o_i \in \mathit{Class}, \forall A_k \in \mathit{Setof\_attr}(o_i), \forall o_j \in \mathit{Instance}, \forall as \in \{+, -\}, \forall at \in \{st, wk\}, o_j < o_i :$
$(s, o_i, \mathit{write}(A_k), as, at) \rightarrow (s, o_j, \mathit{write}(A_k), as, at)$.

Rule 27 The delete privilege on a class implies the privilege to delete every instance
of the class. The negation of the delete privilege on a class propagates in the same
way. Formally, $\forall s \in S, \forall o_i \in \mathit{Class}, \forall o_j \in \mathit{Instance}, \forall as \in \{+, -\}, \forall at \in \{st, wk\}, o_j < o_i :$
$(s, o_i, \mathit{delete}, as, at) \rightarrow (s, o_j, \mathit{delete}, as, at)$.

All previous implication rules propagate authorizations top-down with respect to
the object granularity hierarchy. We now introduce two implication rules that propagate
authorizations bottom-up with respect to the object granularity hierarchy.

Rule 28 The privilege to read an attribute of an instance implies the privilege to
read the definition of the class to which the instance belongs. Formally, $\forall s \in S, \forall o_i \in \mathit{Class}, \forall A_k \in \mathit{Setof\_attr}(o_i), \forall o_j \in \mathit{Instance}, \forall at \in \{st, wk\}, o_j < o_i :$
$(s, o_j, \mathit{read}(A_k), +, at) \rightarrow (s, o_i, \mathit{read\_def}, +, at)$.

Rule 29 The privilege to read the definition of a class implies the privilege to read
the definition of the database which contains the class. Formally, $\forall s \in S, \forall o_i \in \mathit{Database}, \forall o_j \in \mathit{Class}, \forall at \in \{st, wk\}, o_j < o_i :$
$(s, o_j, \mathit{read\_def}, +, at) \rightarrow (s, o_i, \mathit{read\_def}, +, at)$.
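The rules for objects (Rules 18 to 29) can be sketched as a traversal of the object granularity hierarchy. The hierarchy below is an assumption (Figure 2 is not reproduced here; the instance name `Emp1` in particular is hypothetical), attribute-level modes are encoded as tuples, and only the rules of this subsection are applied, not those of Section 4.2.

```python
# Hypothetical granularity hierarchy: child -> parent, plus object types.
PARENT = {"Emp1": "Employees", "Employees": "Administration"}
TYPE = {"Administration": "database", "Employees": "class", "Emp1": "instance"}


def children(o):
    return [c for c, p in PARENT.items() if p == o]


def step(o, mode, sign):
    """(object, mode) pairs directly implied by Rules 18-29.
    The sign of the authorization is preserved."""
    t, out = TYPE[o], set()
    is_attr = isinstance(mode, tuple)
    for c in children(o):                                   # top-down rules
        if t in ("database", "class") and mode in ("read", "write"):
            out.add((c, mode))                              # Rules 18, 20
        if t == "database" and mode == "read_def" and sign == "-":
            out.add((c, "read_def"))                        # Rule 19
        if t == "database" and mode == "write":             # Rules 21-24
            out.update((c, m) for m in
                       ("delete", "write_def", "delete_def", "create"))
        if t == "class" and (is_attr or mode == "delete"):
            out.add((c, mode))                              # Rules 25-27
    if sign == "+" and o in PARENT:                         # bottom-up rules
        if t == "instance" and is_attr and mode[0] == "read":
            out.add((PARENT[o], "read_def"))                # Rule 28
        if t == "class" and mode == "read_def":
            out.add((PARENT[o], "read_def"))                # Rule 29
    return out


def closure(o, mode, sign):
    """All (object, mode) pairs reachable by repeated rule application."""
    seen, pending = set(), [(o, mode)]
    while pending:
        cur_o, cur_m = pending.pop()
        for nxt in step(cur_o, cur_m, sign):
            if nxt not in seen:
                seen.add(nxt)
                pending.append(nxt)
    return seen
```

Under the assumed hierarchy, a positive read(salary) authorization on `Emp1` propagates bottom-up to read_def on `Employees` and then on `Administration`, whereas the corresponding negative authorization implies nothing through these rules.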

4.4  Implication  rules  along  the  inheritance  hierarchy 

Implication rules along the inheritance hierarchy make it possible, given an authorization for a
privilege on a class, to derive authorizations for the privilege on the subclasses of the
class. Implication of authorizations along the inheritance hierarchy may be desirable
in some cases and undesirable in others [13, 6, 10]. Hence, we allow the user
defining a class to indicate whether he wants implication of authorizations along the
inheritance hierarchy. If so, the authorizations of users on the superclasses for the
create and delete access modes and for reading and writing attributes propagate to
the class. The reason why privileges on the definition do not propagate is that these
privileges should be reserved to the creator of the class. Note, however, that if a user
receives a privilege on the class he indirectly receives the read_def privilege on the class,
according to the implication rules of Section 4.2.

To determine whether a subclass inherits the authorizations from a superclass, we
introduce function $\mathit{Is\_Inh}(o', o)$ which, given a class $o'$ and one of its superclasses $o$,
returns True if $o'$ inherits the authorizations specified on $o$, and returns False otherwise.

Note that authorizations applicable to a specific attribute can be propagated to
a subclass only if the attribute is inherited by the subclass. To determine this, we
introduce a function $\mathit{Attr\_Inh}(A_k, o', o)$ which, given an attribute $A_k$ and classes $o$ and
$o'$, returns True if $o'$ inherits $A_k$ from $o$, and returns False otherwise.

Then, the implication of authorizations along the inheritance hierarchies is
determined according to the following rules.

Rule 30 The authorization (negation) to create and delete on a class $o$ propagates
to all subclasses $o'$ of $o$ for which function $\mathit{Is\_Inh}(o', o)$ returns True. Formally, $\forall s \in S, \forall o, o' \in \mathit{Class}, \forall m \in \{\mathit{create}, \mathit{delete}\}, \forall as \in \{+, -\}, \forall at \in \{st, wk\}, o' \prec o, \mathit{Is\_Inh}(o', o) = \mathit{True} :$
$(s, o, m, as, at) \rightarrow (s, o', m, as, at)$.

Rule 31 The authorization (negation) to read/write an attribute $A_k$ on a class $o$
propagates to all the subclasses $o'$ of $o$ such that functions $\mathit{Is\_Inh}(o', o)$ and $\mathit{Attr\_Inh}(A_k, o', o)$
return True. Formally, $\forall s \in S, \forall o, o' \in \mathit{Class}, \forall m \in \{\mathit{read}, \mathit{write}\}, \forall A_k \in \mathit{Setof\_attr}(o), \forall as \in \{+, -\}, \forall at \in \{st, wk\}, o' \prec o, \mathit{Is\_Inh}(o', o) = \mathit{True}, \mathit{Attr\_Inh}(A_k, o', o) = \mathit{True} :$
$(s, o, m(A_k), as, at) \rightarrow (s, o', m(A_k), as, at)$.

The  implication  rules  along  the  inheritance  hierarchy  are  illustrated  in  Figure  5. 
Label  E  associated  with  the  arcs  indicates  the  inheritance  relationship. 
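
Rules 30 and 31 can be read as a one-step propagation function over the class hierarchy. The following is a minimal sketch under stated assumptions: the schema (classes Employee, Manager, Contractor and their attributes) is hypothetical, the tuple layout adds an explicit attribute field for readability, and only direct subclasses are handled (repeated application would cover deeper hierarchies).

```python
from typing import NamedTuple, Optional

class Auth(NamedTuple):
    subject: str
    cls: str              # class the authorization refers to
    mode: str             # "create", "delete", "read" or "write"
    attr: Optional[str]   # attribute A_k for read/write, None otherwise
    sign: str             # "+" or "-"
    atype: str            # "st" (strong) or "wk" (weak)

# Hypothetical schema (an assumption for illustration):
# subclass -> (direct superclass, inherits authorizations?, inherited attributes)
SCHEMA = {
    "Manager": ("Employee", True, {"Name", "Salary"}),
    "Contractor": ("Employee", False, {"Name"}),
}

def is_inh(sub: str, sup: str) -> bool:
    """Is_Inh(o', o): does subclass `sub` inherit authorizations from `sup`?"""
    entry = SCHEMA.get(sub)
    return entry is not None and entry[0] == sup and entry[1]

def attr_inh(attr: str, sub: str, sup: str) -> bool:
    """Attr_Inh(A_k, o', o): does `sub` inherit attribute `attr` from `sup`?"""
    entry = SCHEMA.get(sub)
    return entry is not None and entry[0] == sup and attr in entry[2]

def propagate(a: Auth) -> list:
    """Authorizations implied by `a` on direct subclasses (one application of Rule 30/31)."""
    derived = []
    for sub in SCHEMA:
        if not is_inh(sub, a.cls):
            continue
        if a.mode in ("create", "delete"):
            derived.append(a._replace(cls=sub))          # Rule 30
        elif (a.mode in ("read", "write") and a.attr is not None
              and attr_inh(a.attr, sub, a.cls)):
            derived.append(a._replace(cls=sub))          # Rule 31
    return derived
```

Sign and type are copied unchanged, so a negative or strong authorization propagates exactly like a positive or weak one, as both rules require.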


[Figure omitted. Legend: numbered arcs denote positive or negative implication rule n (here rule 31, for read(A_k) and write(A_k)); E = inheritance relationship.]

Figure 5: Graphical representation of the implication rules along the inheritance hierarchy


4.5  Derivation  of  implicit  authorizations 

Given  the  authorizations  specified  by  the  users  (explicit),  the  rules  illustrated  in  the 
previous  section  allow  new  authorizations  (implicit)  to  be  derived.  The  rules  given  for 
the  different  domains  can  be  jointly  applied  thus  allowing,  from  an  authorization,  the 
derivation  of  other  authorizations,  with  same  or  different  subject,  same  or  different 
object,  and  same  or  different  access  mode. 

Figure 6 illustrates the graph representing all the implication rules of our model,
except for the rule specified for the set of subjects. In this graph, we use one-colored
axes. An arc labeled with + (-) represents the implication for a positive (negative)
authorization. For the sake of clarity, the arcs are not labeled with the corresponding rule
number.

Derivation  of  authorizations  according  to  the  rules  corresponds  to  traversing  the 
arcs  of  the  graph.  The  set  of  authorizations  implied  by  an  explicit  authorization  a  is 
called  the  extension  of  a.  We  now  illustrate  how  to  determine  the  extensions  of  strong 
and  weak  authorizations. 

Since  strong  authorizations  do  not  admit  exceptions,  the  extension  of  a  strong 
authorization  is  composed  of  all  authorizations  which  can  be  derived  from  a  by  applying 
the  rules.  This  is  formalized  by  the  following  definition. 

Definition 6 (Extension of a strong authorization) The extension of a strong
authorization a is the set E(a) defined as: $E(a) = \{a' \mid a \rightarrow^{m} a',\ m \geq 0\}$.

Note  that  a  belongs  to  this  set  (m  =  0). 
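
Definition 6 amounts to the set of nodes reachable from a in the implication graph, including a itself. A minimal sketch, assuming the implication rules are available as a function `implies` that returns the authorizations directly implied by its argument (the function name and the toy graph in the usage note are illustrative, not part of the model):

```python
from collections import deque

def extension(a, implies):
    """Return E(a): everything derivable from `a` in zero or more rule applications."""
    seen = {a}                 # m = 0: a itself belongs to its extension
    queue = deque([a])
    while queue:
        current = queue.popleft()
        for implied in implies(current):
            if implied not in seen:
                seen.add(implied)
                queue.append(implied)
    return seen
```

For a toy graph `g = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}`, the call `extension("a", lambda x: g[x])` visits each authorization once even though "d" is reachable along two paths.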

Determining the extension of a weak authorization is more complex. Indeed, since
weak authorizations can have exceptions, the set of authorizations derivable from an
explicit weak authorization depends also on the content of the strong and weak
authorization bases. Hence, before defining the extension of a weak authorization we give
some definitions to determine when a weak authorization is overridden by exceptions.

Definition 7 (More specific subject) Given two subjects $s, s' \in S$, s' is more
specific than s, written $s' \lhd s$, if and only if $s' <_n s$ with $n \geq 1$.

The above definition states that subject s' is more specific than subject s if and
only if s' is a member (direct or indirect) of s. In the following, notation $s' \unlhd s$ indicates
that either $s' = s$ or $s' \lhd s$.

Definition 8 (More specific object) Given two objects $o, o' \in O$, o' is more specific
than o, written $o' \lhd o$, if and only if either $o' <_n o$ or $o' \prec o$, with $n \geq 1$.

Definition 8 states that object o' is more specific than object o if either o' is a
descendant of o in the object granularity hierarchy, or o' is a subclass of o.

Definition 9 (More specific access mode) Given two access modes $m, m' \in M(class) \cup M(instance)$,
m' is more specific than m, written $m' \lhd m$, if and only if
$m' \in \{write(A_k), read(A_k)\}$ and $m \in \{write, read\}$.

Definition 9 states that an access mode referred to a single attribute is more specific
than an access mode referred to a set of attributes. Note that according to Definition 9
the write access mode on an attribute is considered more specific than the read access
mode on an instance. Moreover the read access mode on an attribute is considered more
specific than the write access mode on an instance. These relationships hold because
of the implication relationship existing between the two access modes (Section 4.2).

Definition 10 (More specific authorization) Given two authorizations $a, a' \in WAB$,
authorization a' is more specific than authorization a, written $a' \lhd a$, if and only if any
of the following conditions is satisfied:

1. $s(a') \unlhd s(a)$, $o(a') = o(a)$, $m(a') \lhd m(a)$;

2. $s(a') \unlhd s(a)$, $o(a') \lhd o(a)$;

3. $s(a') \lhd s(a)$, $o(a') = o(a)$, $m(a) \ntriangleleft m(a')$.

Definition 10 states that an authorization a' is more specific than an authorization
a if and only if either (1) the subject of a' is more specific than or equal to the subject
of a, the objects of a and a' are equal, and the access mode of a' is more specific
than the access mode of a; (2) the subject of a' is more specific than or equal to the
subject of a, and the object of a' is more specific than the object of a; or (3) the
subject of a' is more specific than the subject of a, the objects of a and a' are equal,
and the access mode of a is not more specific than the access mode of a'. The last
condition of item (3) is needed to avoid, given two authorizations a and a' such that
$s(a') \lhd s(a)$ and $o(a') = o(a)$, considering $a' \lhd a$ if $m(a) \lhd m(a')$. For example, given
authorizations a = (G2, Emp1, read(Name), +, wk) and a' = (Bob, Emp1, read, -, wk), a'
cannot be considered more specific than a since $s(a') \lhd s(a)$ (Bob $\lhd$ G2), $o(a') = o(a)$
(= Emp1), but $m(a) \lhd m(a')$ (read(Name) $\lhd$ read).

Example 1 Consider authorizations:

• a1 = (G2, Employees, write, +, wk);

• a2 = (Bob, Employees, read(Salary), -, wk);

• a3 = (G2, Emp1, write, +, wk);

• a4 = (Bob, Employees, delete, -, wk).

$a_2 \lhd a_1$ since $s(a_2) \lhd s(a_1)$, $o(a_2) = o(a_1)$, $m(a_2) \lhd m(a_1)$ (Definition 10, item 1).

$a_3 \lhd a_1$ since $s(a_3) = s(a_1)$, $o(a_3) \lhd o(a_1)$ (Definition 10, item 2).

$a_4 \lhd a_1$ since $s(a_4) \lhd s(a_1)$, $o(a_4) = o(a_1)$, $m(a_1) \ntriangleleft m(a_4)$ (Definition 10, item 3).
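
The three items of Definition 10 can be checked mechanically. The sketch below is illustrative only: the membership of Bob in G2 and the placement of Emp1 under Employees are assumed fragments of Figures 1 and 2, the access mode is split into a mode name and an optional attribute, and the authorization type is dropped because Definition 10 only compares weak authorizations.

```python
from typing import NamedTuple, Optional

class Auth(NamedTuple):
    subject: str
    obj: str
    mode: str
    attr: Optional[str]   # A_k for read(A_k)/write(A_k), None otherwise
    sign: str

# Assumed hierarchy fragments (not given in this section).
MEMBER_OF = {"Bob": {"G2"}}             # subject -> groups it directly belongs to
PARENT_OBJ = {"Emp1": {"Employees"}}    # object -> objects directly above it

def _ancestors(x, parents):
    """All elements reachable upward from x in one or more steps (n >= 1)."""
    found, stack = set(), list(parents.get(x, ()))
    while stack:
        p = stack.pop()
        if p not in found:
            found.add(p)
            stack.extend(parents.get(p, ()))
    return found

def subj_more_specific(s1, s2):
    return s2 in _ancestors(s1, MEMBER_OF)

def obj_more_specific(o1, o2):
    return o2 in _ancestors(o1, PARENT_OBJ)

def mode_more_specific(a1, a2):
    """Definition 9: an attribute-level read/write is more specific than read/write."""
    return (a1.mode in ("read", "write") and a1.attr is not None
            and a2.mode in ("read", "write") and a2.attr is None)

def more_specific(a1, a2):
    """True if a1 is more specific than a2 according to Definition 10."""
    s_le = a1.subject == a2.subject or subj_more_specific(a1.subject, a2.subject)
    same_obj = a1.obj == a2.obj
    item1 = s_le and same_obj and mode_more_specific(a1, a2)
    item2 = s_le and obj_more_specific(a1.obj, a2.obj)
    item3 = (subj_more_specific(a1.subject, a2.subject) and same_obj
             and not mode_more_specific(a2, a1))
    return item1 or item2 or item3

a1 = Auth("G2", "Employees", "write", None, "+")
a2 = Auth("Bob", "Employees", "read", "Salary", "-")
a3 = Auth("G2", "Emp1", "write", None, "+")
a4 = Auth("Bob", "Employees", "delete", None, "-")
```

With these values, `more_specific(a2, a1)`, `more_specific(a3, a1)` and `more_specific(a4, a1)` hold through items 1, 2 and 3 respectively, matching Example 1.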

We  now  define  when  a  weak  authorization  is  overridden.  We  distinguish  the  cases 
where  a  weak  authorization  is  overridden  by  a  strong  authorization  or  by  another  weak 
authorization. 

In the following, given an authorization a = (s, o, m, as, at), notation |a| denotes
the set composed of authorization a and its negation (i.e., an authorization with the
same subject, object, access mode, and type as authorization a, but with different
sign). That is, $|a| = \{(s, o, m, +, at), (s, o, m, -, at)\}$. Given a weak authorization
a = (s, o, m, as, wk), $|a|^{s}$ denotes the set composed of authorization a'
and the negation of a', where a' is a strong authorization with the same subject, object,
access mode, and sign as a. That is, $|a|^{s} = \{(s, o, m, +, st), (s, o, m, -, st)\}$.

Definition 11 (Strong overriding) Given two authorizations $a, a_k$ such that $a \in ASP$,
$at(a) = wk$, $a_k \in SAB$, we say that $a_k$ overrides a, written $a_k > a$, if and only if
$\exists a_i \in |a|^{s}$, $a_k \rightarrow^{m} a_i$, $m \geq 0$.

Definition 11 states that a strong authorization $a_k$ overrides a weak authorization
a if $a_k$ implies (or is equal to, if m = 0) an authorization with same subject, object and
access mode as a and different type.

The overriding relationship among weak authorizations is more complex. In particular,
in order to define whether an authorization $a_k$ overrides an authorization a',
the authorization from which a' has been derived must be considered. The overriding
relationship among weak authorizations is formalized by the following definition.

Definition 12 (Weak overriding) Given three authorizations $a, a_k, a'$ such that
$a, a_k \in WAB$, $a \rightarrow^{i} a'$, $i \geq 1$, $s(a_k) = s(a')$, $o(a_k) = o(a')$, $a_k$ overrides a', written $a_k > a'$, if
and only if $\exists a_i \in |a'|$, $a_k \rightarrow^{n} a_i$, $n \geq 0$, $a_k \lhd a$.

Definition 12 states that an explicit weak authorization $a_k$ overrides an authorization
a' implied by an explicit weak authorization a if and only if the subject and
the object of $a_k$ are equal to the subject and the object of a', authorization a' or its
negation is implied by authorization $a_k$, and the authorization $a_k$ is more specific than
authorization a.

Given  the  above  definitions,  the  extension  of  a  weak  authorization  is  defined  as 
follows. 

Definition 13 (Extension of a weak authorization) The extension of a weak
authorization a is the set E(a) defined as:

$E(a) = \bigcup_{n=0}^{N_a} E_n(a)$

where

$E_0(a) = \{a\}$ if $\nexists a_k \in SAB$, $a_k > a$; $E_0(a) = \emptyset$ otherwise,

$E_n(a) = \{a' \mid \exists a'' \in E_{n-1}(a),\ \nexists a_k \in (SAB \cup WAB),\ a'' \rightarrow a',\ a_k > a'\}$,

and

$N_a$ is the first n such that $E_{n+1}(a) = \emptyset$.

The existence of such an $N_a$ is ensured by the fact that the implication rules are
finite and they work on finite lattices.

Definition 13 states that the extension of a weak authorization a is composed of all
the authorizations which can be derived from a and which are not overridden. Note that
if $a \notin E_0(a)$, then set E(a) is empty.

Example 2 Consider the subject graph illustrated in Figure 1 and the object hierarchy
illustrated in Figure 2. Suppose that SAB = $\emptyset$, WAB = {a1, a2} where:

• a1 = (G6, Employees, read, +, wk);

• a2 = (Mary, Emp1, read, -, wk).

Let us analyze how authorization a1 propagates. For the sake of clarity we do not
derive the whole extension of a1 but only some authorizations. First of all, $a_1 \in E(a_1)$ since
no strong authorization exists which overrides a1. Moreover, $a_3, a_4 \in E(a_1)$ where:

• a3 = (G6, Emp1, read, +, wk);

• a4 = (Mary, Employees, read, +, wk).

Indeed, $a_1 \rightarrow a_3$, $a_1 \rightarrow a_4$, $a_1 \in E_0(a_1)$ and no authorization exists which overrides
a3 or a4. By contrast, authorization a = (Mary, Emp1, read, +, wk) $\notin E(a_1)$. Even if
$a_3 \rightarrow a$, $a_3 \in E_1(a_1)$ and $a_4 \rightarrow a$, $a_4 \in E_1(a_1)$, $a \notin E(a_1)$ since $a_2 \in WAB$ and $a_2 > a$.

In fact, according to Definition 12, $a_2 \rightarrow^{0} a'$ = (Mary, Emp1, read, -, wk), $a' \in |a|$, and
$a_2 \lhd a_1$. This last relationship comes from Definition 10 since $s(a_2) \lhd s(a_1)$ (Mary
$\lhd$ G6) and $o(a_2) \lhd o(a_1)$ (Emp1 $\lhd$ Employees).

Figure 7 illustrates the propagation of authorization a1.

[Figure omitted. It shows Instance[Emp1] carrying the explicit negative authorization a2 (Mary, read). Legend: explicit positive authorization, implicit positive authorization, explicit negative authorization, link between elements of the object hierarchy, positive implication.]

Figure 7: An example of derivation of implicit authorizations

5  Authorization  State 

The authorizations valid at a given time are all the authorizations explicitly defined
by the users, or derived from the authorizations explicitly defined by the users, which
are not overridden. The set of authorizations valid at a given time is called the
authorization state, formally defined as follows.

Definition 14 (Authorization State) The authorization state $AS \subseteq ASP$ is a set
of authorizations defined as follows:

$AS = \bigcup_{a \in SAB \cup WAB} E(a)$

The simultaneous presence of two (explicit or implicit) authorizations a and a' equal
but for the sign ($a' = \neg a$) and such that no one overrides the other is interpreted as
an inconsistency in our model. This is formalized by the following definition.

Definition 15 (Consistent authorization state) An authorization state AS is
consistent if and only if $\nexists a, a' \in AS$ such that $a' = \neg a$.

Inconsistencies are not allowed in our model and, accordingly, the following
invariant must hold.

Property  1  (Consistency  of  the  AS)  The  authorization  state  is  consistent. 

Each time an authorization is granted, the system determines whether the insertion
of the authorization would introduce an inconsistency in the authorization state. If
so, the grant operation is rejected.
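
The consistency test of Definition 15 and the rejection of an offending grant can be sketched as follows. This is an illustration under assumptions, not the system's actual procedure: authorizations are plain 5-tuples (subject, object, mode, sign, type), and the caller is assumed to supply the extension of the new authorization with overridden authorizations already removed.

```python
def negate(a):
    """Same subject, object, access mode and type; opposite sign."""
    s, o, m, sign, atype = a
    return (s, o, m, "-" if sign == "+" else "+", atype)

def is_consistent(state):
    """Definition 15: no authorization and its negation are both present."""
    return not any(negate(a) in state for a in state)

def try_grant(state, new_extension):
    """Return (resulting state, accepted?). The grant is rejected if it breaks consistency."""
    candidate = state | new_extension
    if is_consistent(candidate):
        return candidate, True
    return state, False
```

Rejecting the whole grant, rather than dropping only the conflicting implied authorizations, keeps Property 1 as an invariant of every reachable state.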

Example 3 Consider the subject graph illustrated in Figure 1 and the object
hierarchy illustrated in Figure 2. Suppose that SAB = $\emptyset$, WAB = {a1, a2} where:

• a1 = (Bob, Administration, read_def, -, wk);

• a2 = (Bob, Emp2, read(Address), +, wk).

Authorization a1 states that user Bob is denied to read the directory of the database
Administration; authorization a2 states that user Bob is authorized to read the
attribute Address of instance Emp2 of class Employees.

Those authorizations generate an inconsistency in the authorization state, since
authorizations $a_3 \in E(a_1)$ and $a_4 \in E(a_2)$ where:

• a3 = (Bob, Employees, read_def, -, wk),

• a4 = (Bob, Employees, read_def, +, wk),

which are one the negation of the other, belong to the authorization state
(Figure 8).


[Figure omitted. It shows Database[Administration] with the explicit negative authorization a1 (Bob, read_def); Class[Employees] with the conflicting implicit authorizations a3 and a4 (Bob, read_def), marked INCONSISTENCY; and Instance[Emp2] with the explicit positive authorization a2 (Bob, read(Address)). Legend: explicit positive, implicit positive, explicit negative and implicit negative authorizations; link between elements of the object hierarchy; positive implication; negative implication.]

Figure 8: An example of inconsistency in the authorization state


6  Access  control 

In this section, we illustrate how access control is performed. An access request can
be characterized as a 3-tuple (u, o, m) with $u \in U$, $o \in O$, $m \in M$, indicating that user
u requests to exercise access mode m on object o.

The access control determines whether to fully grant, partially grant, or deny the
access to the user. Consider a request (u, o, m). The access control performs the
following steps. First, the system checks whether m is executable on o (for instance,
the read_def access mode on a class allows reading the definition of the class) or on its
components (for instance, the read access mode on a class allows reading all attributes
of all instances of the class). In the first case, the access request is granted if there
exists in the authorization state a positive authorization which satisfies the request,
and it is rejected otherwise. In the second case, the access request specified by the
user is split into a set of elementary access requests with u as subject, a component of
o as object, and an access mode related to m (according to the rules) as access mode.
If all elementary access requests are authorized (i.e., a positive authorization exists
for each of them in the authorization state), the system fully grants the access to the
user; if none of the elementary access requests are authorized, the system denies the
access to the user; if only some of the elementary access requests are authorized, the
system partially grants the access to the user, returning the subset of elementary access
requests that are authorized.

The access control can be represented by a function b defined as follows:

$b : U \times O \times M \rightarrow 2^{(U \times O \times M) \times \{True, False\}}$.

Given the access request (u, o, m), function b returns ((u, o, m), True) if the access
request is fully granted; it returns ((u, o, m), False) if the access request is rejected; it
returns the set $\{((u, o_i, m_i), True)\}$ (i.e., the set of elementary access requests
authorized) if the access request specified by u is partially granted.

Example 4 Consider the authorization state illustrated in Figure 7 and access request
(Mary, Employees, read) which states that user Mary requests to read all instances of
class Employees.

The function of access control b returns the following set: $\{((Mary, Emp_i, read), True)\}$
for all $i \neq 1$. In fact, user Mary is implicitly authorized to read all instances of class
Employees, except for instance Emp1. Thus, the access request (Mary, Employees, read)
is partially granted.
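
The decision procedure of this section can be sketched as a single function. The sketch makes several assumptions that are not in the paper: the authorization state is reduced to a set of (subject, object, mode, sign) tuples, the split into elementary requests is delegated to a hypothetical `components` function, and class Employees is given three instances Emp1 to Emp3 purely for illustration.

```python
def components(o, m):
    """Hypothetical split of (o, m) into elementary (object, mode) pairs.
    An empty list means m is executable directly on o."""
    if (o, m) == ("Employees", "read"):
        return [("Emp1", "read"), ("Emp2", "read"), ("Emp3", "read")]
    return []

def access_control(request, auth_state, split=components):
    """Return a list of ((u, o, m), bool) pairs playing the role of function b."""
    u, o, m = request
    parts = split(o, m)
    if not parts:
        # First case: grant iff a positive authorization satisfies the request.
        return [((u, o, m), (u, o, m, "+") in auth_state)]
    granted = [(u, oi, mi) for (oi, mi) in parts if (u, oi, mi, "+") in auth_state]
    if len(granted) == len(parts):
        return [((u, o, m), True)]          # fully granted
    if not granted:
        return [((u, o, m), False)]         # denied
    return [(r, True) for r in granted]     # partially granted

# Mary may read every instance except Emp1, as in Example 4.
STATE = {("Mary", "Emp2", "read", "+"), ("Mary", "Emp3", "read", "+")}
```

Calling `access_control(("Mary", "Employees", "read"), STATE)` yields only the elementary requests on Emp2 and Emp3, which corresponds to the partial grant described in Example 4.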

7  Conclusions 

The semantic concepts of object-oriented database systems make traditional access
control models developed for operating systems and traditional database systems
inadequate. Work in the area of authorization models for object-oriented systems is still at
a preliminary stage and many questions are left open. In this paper we have presented
an authorization model for object-oriented databases. The model supports both
positive authorizations (meaning permission to do something) and negative authorizations
(meaning denial to do something). From the authorizations specified by the users the
system derives new authorizations on the basis of the relationships existing among
subjects, objects, access modes, and on the inheritance hierarchies. The model allows
authorizations to be overridden by permitting two types of authorizations (strong and
weak). Strong authorizations cannot be overridden whereas weak authorizations can
be overridden according to specified rules. In the paper we have given the rules for the
derivation and the overriding of authorizations.

Abstract

In this paper we will describe how constraints involving integrity and
security can be specified in the active object-oriented knowledge-base
system MOKUM. We will also indicate how they are implemented.

1 Introduction


In a knowledge base system constraints are very important entities which should
be treated carefully. This holds in particular for active knowledge base systems which
are supposed to realize Information and Communication Systems (ICS), which
connect people and information systems. One has to account for the fact
that people, willfully and unwillfully, change data and rules about the data.
It is of paramount importance that these changes are governed by rules in the
form of constraints and security checks, which are maintained by the knowledge
base system in a safe way. The constraints are important for the quality of the data,
while access rules must give security in order to be able to guarantee
protection and privacy of people's data. Both kinds of rules are closely interrelated,
as was noted already in one of the earliest papers by the Ingres group,
[Stonebraker, Wong & Held, 1976], but this has not been taken care of by most
data and knowledge base systems: constraints on the data are globally specified
in the form of triggers, while security is defined using a single access matrix.
See e.g. the overview paper [Grefen & Apers, 93], which deals with integrity
constraints in database systems, and the paper [Paton, Diaz & Barja, 93] about
constraints in the form of rules in Object-Oriented (O-O) systems.

Our current work on integrity and security in databases can be considered
as a continuation of work by our group in the not so recent past on security and
databases. We explicitly refer here to the work on:

* privacy and security and a programming language approach; see [vandeRiet,
Kersten & Wasserman, 82], [vandeRiet, Kersten & Wasserman, 82],
[Wasserman, vandeRiet, Kersten & Leveson, 1983],

* access control; see [Kersten, 85],

* keeping secrets by a knowledge base [Sicherman, deJonge & vandeRiet, 83],

* statistical databases [deJonge, 83] and

* cryptography [deJonge, 85].

In more recent years our attention has been focused on Object-Oriented
systems. We developed the MOKUM* system, which is an active object-oriented
knowledge base system (see [vandeRiet, 89]). We also worked on more formal aspects of
Conceptual Modelling. The current paper can be seen as a study of
the issue of Security within an existing system as seen from the standpoint of
a Conceptual Model. We will describe how constraints involving integrity and
security can be specified in MOKUM. In MOKUM both kinds of constraints are
treated from one viewpoint and are not separated. The basic mechanism for
communication in MOKUM is the message, sent from object to object. It can
always be seen who the sender of a message is. Objects that send and receive
messages can usually be identified with office employees having certain rights
and responsibilities. In this way MOKUM differs from usual O-O systems where
objects denote active pieces of software. In most other O-O systems constraints
on classes and meta-classes are themselves considered as active objects which
become active. This is not the case in MOKUM. We have deliberately designed
MOKUM to be close to real-world applications, where persons and
institutions can be active, but where regulations, specifications or collections of things
cannot be active.

*The acronym MOKUM stands for Manipulating Objects with Knowledge and
Understanding in Mokum (= Amsterdam).

In section 2 we will give a short introduction to the MOKUM architecture.
In section 3 we will give some taxonomy of constraints and security rules. In
sections 4 and 5 we will see how these rules can be treated in MOKUM, first
in the form of scripts, then in the form of restrictions. In section 4 we will also
focus our attention on security constraints, how they can be implemented in
MOKUM and some theory about them. In the final section we will give some
conclusions and describe the status of the MOKUM system.


type person is_a thing
    has_a name: string
    script part of person endscript

type employee is_a person
    has_a mgr: employee
    private
    has_a salary: int
    script part of employee endscript

type empl_adm is_a employee  /* shorthand for employee_administrator */
    private
    has_a nr_of_employees: int
    has_a employees: collection_of employee
    script part of empl_adm endscript


Figure  1:  Three  type  definitions  in  MOKUM. 


2  About  MOKUM:  two  levels 

2.1  MOKUM  in  a  nutshell 

In  Information  Systems  it  is  customary  to  differentiate  between  intension  and 
extension.  The  intension  is  the  form  of  the  IS,  called  schema  or  Conceptual 
Model,  while  the  extension  is  the  contents  of  the  IS,  i.e.  the  collection  of  all 
objects  in  the  IS. 

In MOKUM we have made a principle of this division: we have type
definitions and instances of these types in the form of objects. Each object is the
instance of one (or more) type(s). As shown in fig. 1, objects of type employee
are also objects of type person. We adopt here the convention that all identifiers
which have a pre-defined meaning in MOKUM are printed in bold face. For an
example of a script see figure 2 in a later section.

Objects are made by new. After their creation they have type thing, after
which their only property is that they have an identity: the object_identity,
which is a uniquely determined identifier, in principle not accessible in the
outside world. After being created they can get other types, such as person or
employee. For example, a new person, later also becoming an employee,
identified by the value of the (Prolog) variable John, is created after:


    new(John), addType(John, person, [(name, 'Jan')]),
    addType(John, employee, [(salary, 10.000)])

As customary, we say that objects having a certain type are instances of that
type. Because objects have types they have attributes, which may have values.
In our case, John is an instance of person as well as of employee so that it has
the attributes name, salary and mgr, of which only the first two are filled in.
One can give a value to an attribute as follows:


    Pete to John:mgr

Asking values of attributes can be done as follows:

    S from John:salary


Because  salary  is  specified  as  private  this  statement  is  protected:  only 
in  the  script  part  of  the  employee  type  of  John  (and  in  the  script  parts  of  its 
possible  subtypes),  this  statement  will  result  in  putting  John’s  salary  in  the 
Prolog  variable  S.  Outside  this  script  part  (with  the  exception  to  be  discussed 
in  connection  with  keepers)  this  statement  will  fail.  Also  this  statement  will 
be  successful  only  when  the  object  who  asks  for  the  salary  is  John  itself  (or 
its  keeper,  see  next  discussion).  For  a  more  detailed  discussion  about  access 
protection  see  section  4.3. 
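
The two conditions on reading a private attribute (the statement must occur in the script part of the defining type or of a subtype, and the asking object must be the object itself or its keeper) can be modelled outside MOKUM as follows. This Python sketch is only an analogy: the dictionary-based object layout, the `keeper` field and the function names are assumptions, and the full keeper rules are those of section 4.3, which is not reproduced here.

```python
class AccessDenied(Exception):
    pass

# Type hierarchy and private attributes taken from Figure 1.
SUPERTYPE = {"person": "thing", "employee": "person", "empl_adm": "employee"}
PRIVATE = {"salary": "employee"}      # private attribute -> defining type

def is_subtype(t, ancestor):
    """True if t equals ancestor or lies below it in the type hierarchy."""
    while t is not None:
        if t == ancestor:
            return True
        t = SUPERTYPE.get(t)
    return False

def read_attr(obj, attr, script_type, asker):
    """Model of 'S from obj:attr' issued by `asker` inside the script part of `script_type`."""
    owner = PRIVATE.get(attr)
    if owner is not None:
        in_script = script_type is not None and is_subtype(script_type, owner)
        keeper = obj.get("keeper")
        trusted = asker is obj or (keeper is not None and asker is keeper)
        if not (in_script and trusted):
            raise AccessDenied(attr)
    return obj["attrs"][attr]

admin = {"attrs": {}, "keeper": None}
pete = {"attrs": {"name": "Pete"}, "keeper": admin}
john = {"attrs": {"name": "Jan", "salary": 10000}, "keeper": admin}
```

Here `read_attr(john, "salary", "employee", john)` succeeds, whereas the same call with `pete` as asker, or from the script part of `person`, raises `AccessDenied`; the public attribute `name` is readable from anywhere.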

In a type definition one also can specify restrictions, computed attributes and
scripts. For restrictions and scripts we refer to sections 4 and 5; for computed
attributes we refer to other MOKUM documents (see e.g. [vandeRiet, 89]).

2.2 About objects, collections, types and classes

With  types  one  can  reason  (epistemic).  So  inheritance  of  properties  is  defined 
on  the  level  of  types:  from  the  above  definitions  one  can  infer  that  objects  of 
type  employee  also  have  the  person  attribute  name.  As  far  as  the  object  is 
concerned,  there  is  no  difference  between  John  having  salary  as  attribute  and 
name.  In  our  perspective  it  would  be  wrong  to  say  that  the  object  John  in  any 
way  inherits  some  properties. 

Objects stand for entities existing in the Universe of Discourse, which may
be put together in collections (ontology), which also have a long life as they
are persistent and as such stored in an O-O database. In most O-O systems
the collection of all objects being instances of a certain type is called a class,
usually with the same name. This is confusing, in particular when that class is
also called an object. A consequence of this principle is that MOKUM does not
have the notion of class, but it has the notion of collection.


Definition:  collection  and  its  keeper 

A  Collection  is  a  set  of  objects  having  a  certain  type,  and  it  can  be  defined  only 
as  the  value  of  an  attribute  of  an  object  called  its  keeper. 


It  is  the  task  of  the  keeper  of  a  collection  C  to  maintain  security  and  integrity 
rules  about  C  and  its  members. 

The elements of C all have at least one common type. (This condition is not
a severe one, because, if one wants to put objects of different types in C, another
type can be introduced, being a supertype of these different types (and one could
take thing for it), and these objects can be made instances of that supertype.)

Other attributes of the keeper K can be connected to C, e.g. an attribute nr_of_elmts can
denote the current number of elements in C, while an attribute max_nr_of_elmts
can denote the maximum number of elements in C. The combination of collection
and special attributes is what other people call "grouping" (see [Motschnig &
Storey, 93]).
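
As an illustration of a keeper guarding a collection together with its nr_of_elmts and max_nr_of_elmts attributes, consider the following Python sketch. It is an analogy rather than MOKUM code: the class name `Keeper`, the representation of objects as dictionaries with a `types` set, and the boolean return value of `add` are assumptions made for this example.

```python
class Keeper:
    """Object holding a collection C as an attribute and enforcing its integrity rules."""

    def __init__(self, element_type, max_nr_of_elmts):
        self.element_type = element_type        # common type of all members
        self.max_nr_of_elmts = max_nr_of_elmts
        self._collection = []                   # C, reachable only through the keeper

    @property
    def nr_of_elmts(self):
        return len(self._collection)

    def add(self, obj):
        """Add obj to C if the integrity rules allow it; report success."""
        if self.element_type not in obj["types"]:
            return False                        # member lacks the common type
        if self.nr_of_elmts >= self.max_nr_of_elmts:
            return False                        # collection is full
        if any(member is obj for member in self._collection):
            return False                        # a collection is a set
        self._collection.append(obj)
        return True
```

Because every change to the collection goes through `add`, the keeper is the single place where type and cardinality rules are checked, which mirrors the role the text assigns to it.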


In the example above we may have several employee administrators, all
instances of the type empl_adm. All have a (different) collection of employee
objects for which they are responsible. These collections are also protected.
In the script part and in restrictions the access to these collections and their
integrity constraints is regulated.

In the script part the reaction of an object upon receiving some
message is specified. The object can be in different states and, dependent on the state, it
reacts to messages called triggers. The behaviour of an object can be
characterized as a Finite State Automaton. There are two kinds of triggers: one
is activated by another object, called the sender, and the other is a reaction to a
timer, set by the object itself. The message itself is also an object and is identified
by message; it must be an instance of a user-defined type, being itself a subtype
of the type message_type. To transfer parameters in the message one can
simply use attributes. In section 4 we will see some examples. For more extensive
examples one is referred to [vandeRiet, 89].
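
The script part described above behaves like a table-driven finite state automaton. The sketch below shows that idea in Python; it does not reproduce MOKUM script syntax, and the states, message types (`leave_request`, `timer`) and the `outbox` list are hypothetical names chosen for the example. A timer trigger is modelled simply as a message without a sender.

```python
class ScriptedObject:
    """Object whose reaction to a message depends on its current state."""

    def __init__(self, name, initial_state, transitions):
        self.name = name
        self.state = initial_state
        # (state, message type) -> (next state, action)
        self.transitions = transitions
        self.outbox = []

    def receive(self, message_type, sender, **attributes):
        """React to a trigger; message parameters travel as attributes."""
        rule = self.transitions.get((self.state, message_type))
        if rule is None:
            return False                 # no trigger for this message in this state
        next_state, action = rule
        action(self, sender, attributes)
        self.state = next_state
        return True

def note_request(obj, sender, attributes):
    obj.outbox.append((sender, "ack", attributes.get("days")))

def clear(obj, sender, attributes):
    obj.outbox.clear()

clerk = ScriptedObject("clerk", "idle", {
    ("idle", "leave_request"): ("busy", note_request),
    ("busy", "timer"): ("idle", clear),
})
```

Since the sender is passed to every action, a script can decide on the basis of who sent the message, which is the hook the later sections use for security checks.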

2.3  Some  details  about  the  implementation 

The current MOKUM system consists of a compiler, a kernel, a storage facility
and an animation facility. The compiler translates a knowledge base system
written in the MOKUM syntax into Prolog predicates. These predicates together
with the kernel form a Prolog program which can be run as a simulation of a
real ICS system. Objects can be stored in an INGRES database system (see
[vandeRiet & Gamito, 90]), but not necessarily; they can also be stored in
the Prolog fact base. The animation facility (see [Croshere, vandeRiet & Blom,
93]) makes it possible to see what is happening during a MOKUM simulation.
Objects are shown using three windows, one of which is the animation window
in which one can see the objects changing. The system is really very small
(about 20 pages of Prolog text for the kernel, 20 pages of C code for the interface
with INGRES, 20 pages of XPC Prolog text for the animation facility and also
about 20 pages for the compiler, written in C). It has mainly been used in an
educational environment, where it should be easy for students to add facilities,
such as the facility discussed in this paper.

The addition of the restrictions and constraints resulted in a system called
MOKUM-C. 


3  About  constraints 

A constraint is a formula which specifies some properties of some collections of
objects. In its most general form several collections may be involved, while the
constraint may refer to set properties, such as nr_of_elmts, or to combinations
of sets, such as the union or the intersection of two sets, but also to properties of
the individual elements, such as the salary in the case of a collection of millionaires.
Furthermore, a constraint may refer to a certain operation, e.g. an update or
the addition of an object to a collection, and the constraint may refer to both
the value before and the value after the operation. Finally, a constraint may
refer to the context in which it should be seen, such as the object that issues
the update.

In  general,  one  can  say  that  each  time  some  manipulation  on  an  object  or 
a  collection  mentioned  in  a  constraint  or  an  object  being  an  element  of  such  a 
collection  is  performed  (create/change/inspect/destroy),  some  check  has  to  be 
performed. 

3.1 Classification of constraints

In  the  following  we  will  give  a  classification  of  the  different  kinds  of  constraints 
possible. 

First, there are constraints which refer only to static properties of the objects,
i.e. properties which must always hold as soon as an object or collection gets
a certain type (i.e. indeed has the properties with which the constraint deals).

Second, there are dynamic properties which are attached to certain operations;
these deal with the properties before and after the operation,
for example that a salary may only increase.

Third, there is context dependency: usually the object on whose behalf the
operation is carried out is attached to the operation (in MOKUM terminology:
the sender of the message). There may be constraints restricting these
operations, usually called authorization constraints. We are working in an
environment in which objects may stand for persons who want to change or inspect
certain properties, such as their salary. We assume that real persons can sit at
a workstation and be connected to their person object counterpart. One can
imagine that the properties this real person can see and change are the person
properties, but that the properties as employee, such as salary, are not
freely available. These properties are only available through the intervention of
the employee administrator.

The  following  is  a  possible  list  of  different  kinds  of  constraints: 

C1 Constraints on attributes of one object only:

   C1.1 single-attribute constraint; example:
        0 < age < 140

   C1.2 two-attribute constraint; example:
        nr_of_elmts < max_nr_of_elmts

   C1.3 constraint concerning new and old value; example:
        new.age > old.age

C2 Constraints on attributes of two objects

C3 Constraints on one collection

   C3.1 The properties of the members of the collection are not involved; example:
        #(C) < 10

   C3.2 Only the properties of the members of the collection are involved; example:
        ∀x ∈ C : x.salary > 1000.000   or   ∃e ∈ C : e.function = boss

   C3.3 A combination; example:
        #(C) < 10 ∧ ∃x ∈ C : x.salary > 1000.000

C4 Constraints on two or more collections C and D:

   C4.1 intersection constraint; example:
        (C ∩ D) = ∅

   C4.2 constraint involving a cardinality; example:
        #(C ∪ D) < 10
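
To make the categories concrete, some of the example constraints can be written as plain predicates. The following is a minimal Python sketch, not MOKUM code; the Employee record, the function names and the reading of 1000.000 as one million are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    # Hypothetical record type, used only for this illustration.
    name: str
    age: int
    salary: int
    function: str


def c1_1(e):
    """C1.1: constraint on a single attribute of one object."""
    return 0 < e.age < 140


def c1_3(old, new):
    """C1.3: constraint relating the old and the new value."""
    return new.age > old.age


def c3_1(c):
    """C3.1: only the cardinality of the collection is involved."""
    return len(c) < 10


def c3_3(c):
    """C3.3: cardinality combined with a property of some member."""
    return len(c) < 10 and any(e.salary > 1_000_000 for e in c)


def c4_1(c, d):
    """C4.1: the two collections must not overlap."""
    return not (set(c) & set(d))
```

The static constraints (C1.1, C3, C4) take only the current state as input, whereas C1.3 needs both the state before and the state after an operation, which is why such constraints are attached to operations.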

The operations we have to look at are the following. Suppose an object O
is also a member of a collection C.

OP1 only on the single object

   OP1.1 O's creation
   OP1.2 O gets a new type
   OP1.3 some attribute of O gets a value, without having one before
   OP1.4 some attribute of O gets another value
   OP1.5 O is used as value of an attribute of (another) object
   OP1.6 O is destroyed

OP2 only on the collection

   OP2.1 C is created
   OP2.2 C is destroyed

OP3 on the relation of the object and the collection

   OP3.1 O is added into C
   OP3.2 O is deleted from C

The  constraints  which  involve  old  and  new  values  are  all  attached  to  certain 
operations,  usually  update  operations. 

The above operations may all be connected to the actor of the operation,
i.e. the object that issued the request to carry out this operation and that must be
entitled to do so. In principle the context can involve the time of the day
and/or conditions of other objects. Example: a nurse can inspect a file of a patient
when the responsible physician is not present.

There is a large body of literature about the efficient maintenance of
integrity and security constraints; we only refer to [Weigand, 93] for an overview.


3.2  Security  constraints 

When a constraint explicitly involves the context in which an operation takes
place we say that it is a security constraint. For static constraints there is
no security involved; only when the constraints are dynamic can one speak of
security.  In  this  case  some  operation  has  to  be  carried  out  on  some  piece  of 
the  knowledge  base  and  the  operation  is  carried  out  on  behalf  of  some  entity, 
usually  a  person.  The  context  is  represented  by  the  invoker  of  the  operation  and 
possibly  some  other  circumstantial  information,  such  as  the  time  of  the  day,  or 
mode  of  operation  (e.g.  urgency).  A  typical  situation  is  the  Automatic  Teller 
Machine;  dependent  on  the  person  who  uses  the  ATM  and  dependent  on  the 
amount  of  cash  available,  money  can  be  withdrawn,  which  leads  to  an  update 
of  the  client’s  account  in  the  bank’s  database.  The  operation  is  an  update  on 
the  account  of  the  database,  the  context  consists  of  invoker  and  the  state  of  the 
ATM. 

In some security systems one has levels of security. For example, in a military
environment  the  general  can  see  and  do  more  than  a  soldier.  The  information 
is  characterized  as  top-secret,  secret,  confidential  and  non-confidential,  say.  We 
shall  see  in  the  next  section  how  such  a  security  system  can  be  specified  in 
MOKUM. 


4  Constraints  in  MOKUM  scripts 
4.1  Scripts  in  MOKUM 

In  MOKUM  there  is  a  natural  place  where  these  checks  can  be  executed,  namely 
at  the  script  part  of  the  object  or  of  the  keeper  of  the  collection  involved  in  the 
constraints. 

Let us assume that we have an employee object, John, member of a collection
employees with keeper Empl_adm. John also has type person. See figure 1 for
the type definitions.

We assume that salary is a private attribute which can only be seen and
changed by the object itself and by the keepers of the collections the object is
a member of. We assume that John as employee is a member of the employees
attribute of the object Empl_adm. There are two places where it is allowed to
manipulate the salary attribute of John: in its own script, i.e. the script part
of employee, and in Empl_adm's script, i.e. the script part of empl_adm.

In principle, the compiler can syntactically check that "salary" indeed occurs at
no other place. This is not enough, however, as the compiler cannot see
that John is referring to his own salary. So a dynamic check is also necessary, and
this is how the kernel of MOKUM works: operations on private attributes are
translated so that this check is carried out. To be more precise: the operation
to, which puts a new value in some attribute of an object O, also has as
parameter the object that is the caller, and when the attribute is a private one
it is checked whether the caller object is the same as the object O, or whether
the caller object is among the keepers of O. Evidently, this presupposes that
we have available for every object a list of its keepers. Things are somewhat
more complicated because the type of the collection also has to be administered.
It is possible that a keeper has several types and it is also possible that it is
the keeper of several collections, even of several types. In actual practice, when
objects are stored in a separate database, the whole of operation and checking
is left to the database system, using some kind of access table.
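
The dynamic check described above can be illustrated outside MOKUM. The following Python sketch is an assumption-laden model, not the kernel's actual code: the class Obj, the function to and the exception AccessDenied are hypothetical names, and the list of keepers is simply stored with each object.

```python
class AccessDenied(Exception):
    """Raised when the caller is neither the object itself nor one of its keepers."""


class Obj:
    # Hypothetical stand-in for a MOKUM object.
    def __init__(self, name, private=()):
        self.name = name
        self.private = set(private)   # names of private attributes
        self.attrs = {}               # attribute values
        self.keepers = set()          # keepers of collections this object belongs to


def to(value, obj, attr, caller):
    """Put `value` into `obj.attr` on behalf of `caller`, checking private access."""
    if attr in obj.private and caller is not obj and caller not in obj.keepers:
        raise AccessDenied(f"{caller.name} may not change {attr} of {obj.name}")
    obj.attrs[attr] = value
```

A usage example: if john is an Obj with private attribute "salary" and adm has been added to john.keepers, then to(1000, john, "salary", caller=adm) succeeds, whereas the same call with an unrelated caller raises AccessDenied.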

In any case, on a low level, MOKUM checks whether a certain operation can
be  performed  by  some  object  in  its  script  part.  This  provides  the  means  for  a 
full  and  most  general  checking  for  the  context  in  which  some  operation  referring 
to  a  private  attribute  is  carried  out.  A  message  must  be  sent  to  the  keeper 
of  the  object,  or  to  the  object  itself.  This  message  should  specify  the  kind  of 
operation  to  be  performed. 

Let us assume for the above example that there is no script part for
employee, i.e. the employee object can only be operated upon by Empl_adm. We
may have as a rule that an object of type person can only see its own salary
while its manager can change (and for the sake of the example: increase) it. In
figure 2 the script of empl_adm is shown which specifies these rules:

script
  state active:
    at_trigger see_salary:
      type_of(sender) = employee,    /* is the sender an employee? */
      select(E in employees where E = sender),
      sender = E,                    /* is sender this employee? */
      E:salary to message:param,     /* return the salary */
      next(active).
    at_trigger change_salary:
      type_of(sender) = employee,    /* is the sender an employee? */
      select(E in employees where E = message:empl),
      sender = E:mgr,                /* is sender this employee's mgr? */
      N from message:param,          /* get new salary from message */
      N > E:salary,                  /* is new salary higher? */
      N to E:salary,                 /* assign to salary attribute */
      next(active).
endscript.

Figure 2: A script for empl_adm.

Note that the commands in figure 2 should be read as Prolog predicates:
if one fails the rest is not executed. In actual practice we should also have
provided this specification with some code for appropriate error messages.¹
Messages must have a type defined as follows:

type message_type1 is_a message_type
  has_a empl: employee    /* identifier of the employee */
  has_a param: int        /* for passing salaries */

From  this  example  one  can  see  that  quite  complex  checks  can  be  specified 
before  some  operation  is  carried  out.  By  providing  the  low-level  identity  checks, 
discussed  above,  it  is  thus  possible  to  create  a  truly  safe  knowledge  base  system 
in  which  the  operations  are  carried  out  only  when  the  appropriate  authorization 
checks  have  been  performed  successfully. 

As another example, take the situation that a salary can only be changed
when both the manager and the boss are acting in concert. Suppose the boss
must first send a message to the Empl_adm and within 2 time units the manager
must then send the change_salary message. To provide for secure checking the
Empl_adm gets an extra state, attention, which it enters when the boss sends
that message. Only in this state is Empl_adm willing to listen to the message
change_salary. After 2 units of time the Empl_adm changes automatically back to
the state active, in which it is not willing to react to the manager's message.
The script now runs as follows:


script
  state active:
    at_trigger see_salary:            /* same as above */
      next(active).
    at_trigger change_to_attention:
      sender = boss,                  /* is the sender the boss? */
      now+2 to time_trigger,
      next(attention).
  state attention:
    at_trigger change_salary:         /* same as above */
      next(active).
    at_time time_trigger:             /* waited long enough */
      next(active).
endscript

¹ Note that above and in the rest of this paper we allow ourselves some freedom as far as
MOKUM syntax is concerned. Actually, sender is an attribute of message and getting values
in and out of attributes has to be done in MOKUM more clumsily: one should write "S from
message:sender, S = E" instead of "sender = E".


We  have  shown  that  quite  complex  contextual  security  rules  can  be  put  in 
the  script  part  of  a  keeper  of  a  collection. 
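
The two-state protocol above can be modelled as a small state machine. The Python sketch below is only an illustration under stated assumptions: time is passed in explicitly as `now`, the timer is taken to have fired once `now` reaches the deadline, and the class and parameter names (EmplAdm, managers, salaries) are hypothetical.

```python
class EmplAdm:
    """Keeper with the states 'active' and 'attention'."""

    def __init__(self, boss, managers, salaries):
        self.boss = boss
        self.managers = managers      # employee -> manager
        self.salaries = salaries      # employee -> salary
        self.state = "active"
        self.deadline = None

    def _fire_timer(self, now):
        # Models at_time time_trigger: fall back to active once the timer expires.
        if self.state == "attention" and now >= self.deadline:
            self.state = "active"

    def change_to_attention(self, sender, now):
        self._fire_timer(now)
        if self.state != "active" or sender != self.boss:
            return False
        self.deadline = now + 2       # corresponds to: now+2 to time_trigger
        self.state = "attention"
        return True

    def change_salary(self, sender, empl, new_salary, now):
        self._fire_timer(now)
        if self.state != "attention":
            return False              # not listening in state active
        if sender != self.managers.get(empl):
            return False              # sender is not this employee's manager
        if not new_salary > self.salaries[empl]:
            return False              # a salary may only increase
        self.salaries[empl] = new_salary
        self.state = "active"
        return True
```

In this model a failed check leaves the state unchanged; whether the MOKUM kernel behaves the same way on a failing predicate is not stated in the text.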

Turning to the list of constraints in section 3, we can easily see that all
kinds can be dealt with. Take C4: when more than one collection is
involved with one keeper, then a keeper type can be specified in which these
collections are values, so that this keeper can be made responsible for keeping
all the constraints. When more than one keeper is involved, these have
to communicate with each other. As an illustration, take as example that there
are two empl_adm's, one for the male and one for the female employees. Let
us call them M and F. The constraint is that their collections may not overlap.
M and F have the same type: empl_adm. In the specification of empl_adm
it is not possible to refer to M and F directly, i.e. by names denoting the
respective objects. It is, however, possible to specify an attribute which denotes
the "other" keeper. Let this attribute be called "otherEA". When M and F
are created this attribute gets as value F and M, respectively. The definition of
empl_adm may now look like:


type empl_adm
private
  has_a otherEA: empl_adm
  has_a nr_of_employees: int
  has_a employees: collection of employee
script
  state active:
    at_trigger add_employee:
      send(otherEA, check_membership, message),
      R from message:param,
      (R = 1     /* empl is member of the other EA */
       ;
       R = 0,    /* empl is not member of the other EA */
       message:empl into employees),
      next(active).
    at_trigger check_membership:
      sender = otherEA,    /* only the other EA is entitled to get an answer */
      (select(E in employees where E = message:empl),
       1 to message:param
       ;
       0 to message:param),
      next(active).
endscript.
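
The message exchange between the two keepers can be mimicked with two cooperating objects. The following Python sketch is illustrative only: method calls stand in for MOKUM messages, and the names EmplAdm, other_ea, add_employee and check_membership merely mirror the definition above.

```python
class EmplAdm:
    """Keeper of one employee collection that must stay disjoint from the other keeper's."""

    def __init__(self):
        self.other_ea = None          # set once both keepers exist
        self.employees = set()

    def check_membership(self, sender, empl):
        # Only the other keeper is entitled to get an answer.
        if sender is not self.other_ea:
            raise PermissionError("only the other keeper may ask")
        return empl in self.employees

    def add_employee(self, empl):
        # Ask the other keeper first; add only when it does not hold empl.
        if self.other_ea.check_membership(self, empl):
            return False
        self.employees.add(empl)
        return True


def make_pair():
    """Create M and F and let each refer to the other, as described in the text."""
    m, f = EmplAdm(), EmplAdm()
    m.other_ea, f.other_ea = f, m
    return m, f
```

With m, f = make_pair(), calling m.add_employee("john") succeeds, after which f.add_employee("john") is refused, so the intersection of the two collections stays empty.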

It would be interesting to see whether a compiler can be made that generates
code like this when given the constraint in its original form:


M:employees ∩ F:employees = ∅.


Looking back now at the list of possible operations, we remark that the
relevant operations in MOKUM, such as new, add_type, from, to, into,
out_of, destroy (for types of objects) and delete (objects), are all used in
the script part and are thus under the control of the ICS designer.


4.2  Security  in  MOKUM 

Let us first see how a typical security system can be implemented in MOKUM.
In the next subsection some theoretical remarks will be given. Take the
military situation as sketched in section 3.2. Suppose the objects of interest are
documents characterized as top-secret (T), secret (S), confidential (C) and
non-confidential (N) and that they are to be inspected by generals (G), lieutenants
(L) and soldiers (S). Gs may see documents characterized S, C and N, Ls may see
C and N documents and Ss may see only N documents. The way to implement
a safe and secure system in MOKUM is to put the documents in a collection,
and to install a keeper of that collection. The keeper is called Security Officer.
The documents have an attribute sensitivity with possible values 3, 2, 1 and 0.

Another attribute of a document is contents, in which the contents of the
document is put. Both attributes are supposed to be private. So, only Security
Officers can manipulate both attributes. Military persons have a type which is
a subtype of person, with (at least) one extra private attribute, clearance, with
possible values 3, 2 and 1. It is supposed that the particular object in charge of
determining the clearance of a mil_person is mpm, having type mil_person_mgr.

A  possible  definition  of  the  types  is  given  in  figure  3. 


type mil_person is_a person
private
  has_a clearance: int

type document is_a thing
  has_a identification: string
private
  has_a sensitivity: int
  has_a contents: string

type mil_person_mgr is_a mil_person
private
  has_a mil_persons: collection of mil_person

type security_officer is_a mil_person
private
  has_a documents: collection of document


Figure 3: Type specification of security problem

In the script part of the security_officer the important security checking can
be specified, as in figure 4.

script
  state active:
    at_trigger inspect_document:
      type_of(sender) = mil_person,
      select(D in documents where D:identification = message:id),
      new(M), add_type(M, message_type2, [(param1, sender)]),
      send(mpm, ask_for_clearance, M),
      /* the clearance is returned in param2 of the message M */
      Clearance from M:param2, destroy(M),
      Clearance >= D:sensitivity,
      /* the test is successful */
      D:contents to message:param;
      /* the test fails and no information is disclosed */
      next(active).
endscript.


Figure 4: A script for the security problem.
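
The essence of figure 4 is a comparison of the sender's clearance with the document's sensitivity before any contents are released. A minimal Python sketch of that check follows; it is not MOKUM code, the class SecurityOfficer and the callable clearance_of (standing in for the message exchange with mpm) are assumptions, and the sample clearances in the usage note are arbitrary.

```python
class SecurityOfficer:
    """Keeper of a document collection that discloses contents only to cleared senders."""

    def __init__(self, documents, clearance_of):
        # documents: identification -> (sensitivity, contents)
        self.documents = documents
        # clearance_of: callable playing the role of the mil_person_mgr (mpm)
        self.clearance_of = clearance_of

    def inspect_document(self, sender, doc_id):
        doc = self.documents.get(doc_id)
        if doc is None:
            return None                       # select fails: unknown document
        sensitivity, contents = doc
        if self.clearance_of(sender) >= sensitivity:
            return contents                   # the test is successful
        return None                           # the test fails, nothing is disclosed
```

For example, with documents {"d1": (2, "plan"), "d0": (0, "menu")} and a clearance table {"alice": 3, "bob": 1}, alice obtains "plan" while bob receives None for "d1" and "menu" for "d0".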

As can be seen from this example, it is very important to have a close
connection between the internal person objects and the real persons. It is considered
part of the man-machine interface to make this connection. In a future project
we will extend MOKUM such that a real person can do almost all that can be
described in a script. Real protection will then be necessary.

In [Olivier & von Solms, 1994] the authors formulate a taxonomy for security
in O-O databases. We notice that many of the different systems they describe
can be implemented in MOKUM, using the notion of private attributes and
keepers of collections, as we have demonstrated above.

4.3 Visibility and accessibility in MOKUM

In this subsection we derive some theory about visibility and accessibility in a
MOKUM program. A MOKUM program consists of types, attributes,
procedures and scripts. In procedures and scripts one can specify the use of types,
object identifiers and attributes. In the rest of this section we shall use the term
'script' meaning both scripts and procedures.

MOKUM does not provide any protection on the usage of object identifiers
and types. If somewhere in a script the value of an object identifier is known,
e.g. because it is the parameter of a message, it can be used: its type can be asked
(type_of) and its attributes can be manipulated, provided the attributes are
available. Attributes declared private are protected in the MOKUM system.
All other attributes can be seen and changed by all objects in the script of any
type and no protection is provided.

Also,  to  make  things  very  simple,  protection  is  full,  i.e.  it  includes  read 
and  write  protection.  In  a  future  MOKUM  system  we  will  build  in  a  more 
differentiated  form  of  protection. 

In the following we shall focus our attention on protection of private
attributes. We will define visibility and accessibility.


A  MOKUM  program  consists  of: 

• a set of (user-defined) types, called type_set;

• a set of attributes, called attr_set;

• a set of facts of the form attr(T,A,CT), where T in type_set, A in attr_set
and CT has the form [Case, Type], where Case = simple or coll,
and Type is either elementary, i.e. int, real or string, or user defined:
Type in type_set;

• a set of facts is_a(T,S), where T and S in type_set, defining the usual
generalization/specialization. The resulting graph must be cycle free. The
meaning is that T inherits all attributes from S, in particular the private
ones;

• a set of facts of the form private(A), for some A in attr_set.

In a MOKUM program the attributes must be unique.

We now define visibility. Non-private attributes are visible in all types; for
private attributes, the type in which the attribute is defined evidently has the
property that that attribute is visible. Moreover, an attribute is visible within
a type when that attribute is visible in a super type or when that type is the
type of a collection keeper whose collection has elements with a type in which
that attribute is visible. In Prolog, this is defined very precisely as follows:

vis(A,T) :- A in attr_set, T in type_set, not private(A).
vis(A,T) :- A in attr_set, T in type_set, private(A),
            ( attr(T,A,_) ;
              (attr(T,_,[coll,S]) ; is_a(T,S)), vis(A,S)
            ).

Visibility is a property which can easily be detected by the compiler using the
Prolog rule given above. Accessibility presupposes visibility, i.e. the compiler
has checked that access to an attribute is allowed. The MOKUM system also
needs to check that the proper object is involved. This is evidently necessary:
with just a check on visibility, in the case where salary is a private attribute
of employee, an employee could change his manager's salary when manager is an
attribute of employee.

For proper access protection MOKUM applies the following rule: suppose
the access involves an attribute of object O. The caller object CO must be the
same as O, or must be one of the keepers of O. So:

acc(CO,O,A,T) :- not private(A), vis(A,T).
acc(CO,O,A,T) :- private(A), vis(A,T), (CO=O ; keeper(CO,O)).

A keeper K of an object O of type t is an object of type kt such that attr(kt,a,[coll,t])
is in the program, while indeed O has been put into K:a. Evidently, the latter
can only be checked at run time.

Visibility  is  a  necessary  property  to  check  before  access  to  a  certain  operation 
by  some  object  can  be  allowed.  One  can  see  from  a  simple  example  why  a  run 
time  check  on  the  object's  identity  is  not  sufficient. 

Suppose  in  the  script  part  of  a  person  the  attribute  salary  is  mentioned,  e.g. 
in  a  statement  to  change  it.  Now  salary  is  supposed  to  be  a  private  attribute 
of  employee,  to  be  manipulated  only  by  an  object  of  type  employee  or  the 
keeper  of  employees.  Without  the  visibility  protection  it  would  be  possible  for 
this  person  to  change  his/her  salary  because  evidently  the  object  identity  test 
succeeds. A similar counter example where a keeper of a collection is involved
is the following: a collection consists of persons, maybe of a subtype of person,
say tennis_player, and the keeper of this collection wants to know the salary of
these persons, which evidently is against the rules. Also in this case a simple
object identifier test would be insufficient.

Visibility can be checked by the compiler, as has been remarked above, but it
can also be checked at run time (actually the representation of the MOKUM
type specification, as generated by the compiler, looks quite similar to the above
representation). The reason we let the compiler do this is of course efficiency:
instead of checking visibility every time a private attribute is accessed, this checking
is done once by the compiler. When we have extended MOKUM with a facility
such that the "real" person can manipulate his internal counterpart person object as
if a script were executed, the run time checking of visibility will be built in for
this case.

Let  us  now  see  the  protection  theory  applied  to  the  example  above.  We 
have: 


type_set = [person, mil_person, document, mil_person_mgr,
            security_officer]

attr_set = [name, clearance, sensitivity, mil_persons, documents]
private(clearance).
private(sensitivity).
private(mil_persons).
private(documents).
is_a(mil_person, person).
is_a(mil_person_mgr, mil_person).
is_a(security_officer, mil_person).
attr(mil_person, clearance, [simple,int]).
attr(document, sensitivity, [simple,int]).
attr(document, contents, [simple,string]).
attr(mil_person_mgr, mil_persons, [coll,mil_person]).
attr(security_officer, documents, [coll,document]).

We shall now compute, using the Prolog system, the visibility of name,
clearance and sensitivity, by asking:


vis(name,T)?-
    person, mil_person, document, mil_person_mgr,
    security_officer.

vis(clearance,T)?-
    mil_person, mil_person_mgr (2*), security_officer.

vis(sensitivity,T)?-
    document, security_officer.

and vice versa asking what is visible by objects of type security_officer,
mil_person_mgr, mil_person and person:

vis(A,security_officer)?-
    name, clearance, sensitivity, documents.
vis(A,mil_person_mgr)?-
    name, clearance (2*), mil_persons.
vis(A,mil_person)?-
    name, clearance.
vis(A,person)?-
    name.
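
The visibility rule and the example facts can be replayed outside Prolog. The Python sketch below is a transcription under stated assumptions: the facts are encoded as Python sets and tuples, the helper names visible_in and visible_from are hypothetical, and the recursion relies on the type graph of this example being free of cycles (also along collection edges). Because the results are collected per type, an answer that Prolog finds twice, marked (2*) above, appears only once.

```python
TYPE_SET = ["person", "mil_person", "document", "mil_person_mgr", "security_officer"]
ATTR_SET = ["name", "clearance", "sensitivity", "mil_persons", "documents"]
PRIVATE = {"clearance", "sensitivity", "mil_persons", "documents"}
IS_A = {
    ("mil_person", "person"),
    ("mil_person_mgr", "mil_person"),
    ("security_officer", "mil_person"),
}
ATTR = {
    ("mil_person", "clearance", ("simple", "int")),
    ("document", "sensitivity", ("simple", "int")),
    ("document", "contents", ("simple", "string")),
    ("mil_person_mgr", "mil_persons", ("coll", "mil_person")),
    ("security_officer", "documents", ("coll", "document")),
}


def vis(a, t):
    """True when attribute a is visible in type t, following the two vis rules."""
    if a not in PRIVATE:
        return True                               # first rule: non-private
    if any(ty == t and at == a for ty, at, _ in ATTR):
        return True                               # defined in t itself
    supers = {s for (sub, s) in IS_A if sub == t}
    members = {ct[1] for (ty, _, ct) in ATTR if ty == t and ct[0] == "coll"}
    return any(vis(a, s) for s in supers | members)


def visible_in(a):
    """Types in which attribute a is visible, in the order of TYPE_SET."""
    return [t for t in TYPE_SET if vis(a, t)]


def visible_from(t):
    """Attributes visible in type t, in the order of ATTR_SET."""
    return [a for a in ATTR_SET if vis(a, t)]
```

For instance, visible_in("clearance") yields mil_person, mil_person_mgr and security_officer, and visible_from("person") yields only name, in agreement with the answers listed above.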

For another interesting approach to visibility and in particular authorization
in an O-O environment see [Rabitti, Bertino, Kim & Woelk, 91].

5  Constraints  as  MOKUM  restrictions 

There is a major problem attached to simply putting the maintenance of the
constraints in the script part, namely that the constraints have to be coded by the
MOKUM programmer in the form of rather detailed MOKUM instructions. It
would be nice if the MOKUM compiler took care of some of these constraints.
In this section we will see how this is done and how far we can go.

We will introduce the notion of restriction. A restriction is always connected
to a type and to some attributes of that type. A restriction is furthermore
inheritable. Both the compiler and the MOKUM kernel have to deal with a
restriction. In principle, the compiler translates a restriction into Prolog code,
while the kernel performs an update only after checking that the particular
Prolog code is successfully executed. If not, the kernel refuses to carry out the
update. An example of a restriction is the following one, where a person's wage
is related to that person's age:


type person
  has_a name: string
  has_a wage: int
  has_a age: int
  restriction age, wage: Restr_proc1
proc
  Restr_proc1 :-
    A from age,
    W from wage,
    W < A.
endproc

Each time age or wage gets a new value (operation OP1.4) this restriction is
invoked: the new value is substituted into either age or wage, the Prolog procedure
Restr_proc1 is invoked and it must return successfully.
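
The way a restriction guards an update can be sketched as follows. This Python fragment only illustrates the idea described above (check the would-be state first, update only on success); the dictionary representation of an object, the table RESTRICTIONS and the function update are assumptions and do not reflect the kernel's real interface.

```python
class RestrictionViolated(Exception):
    """Raised when an update would violate a restriction; the update is refused."""


def restr_proc1(obj):
    # Counterpart of Restr_proc1: the wage must be smaller than the age.
    return obj["wage"] < obj["age"]


# Restrictions attached to attributes of type person (hypothetical table).
RESTRICTIONS = {"age": [restr_proc1], "wage": [restr_proc1]}


def update(obj, attr, value):
    """Assign value to obj[attr] only if all attached restrictions succeed."""
    candidate = dict(obj, **{attr: value})        # state after the update
    for check in RESTRICTIONS.get(attr, []):
        if not check(candidate):
            raise RestrictionViolated(f"{check.__name__} failed for {attr}={value}")
    obj[attr] = value
```

With p = {"name": "john", "wage": 20, "age": 30}, update(p, "wage", 25) succeeds, whereas update(p, "age", 10) raises RestrictionViolated and leaves p unchanged.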

With this kind of restriction we are able to treat the constraints in
categories C1.1 and C1.2 of section 3. Denoting the old value of an attribute, that
is the value before the update, with A from attribute, and the new value with
the attribute itself, we can relate old and new values of attributes, e.g. in the
restriction:


restriction age: Restr_proc2
proc
  Restr_proc2 :-
    N = age,       /* new value */
    O from age,    /* old value */
    N > O.
endproc


Such restrictions are appropriate for connection with one attribute, as only
one attribute can be changed at a time. These kinds of restrictions presuppose
that there always is an old value, which sometimes is not the case (operation
OP1.3). In MOKUM one can handle such cases by first checking whether the
value is available, as is illustrated in the following example:


Restr_proc3 :-
  N = age,
  (O from age, !, N > O ;
   true /* there is no old value for age */).
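
The same "check only if an old value exists" pattern can be written in a few lines of Python. This is a sketch for illustration; the sentinel _MISSING and the function names restr_proc3 and set_age are assumptions.

```python
_MISSING = object()   # sentinel meaning "the attribute has no value yet"


def restr_proc3(old, new):
    """Succeed when there is no old value (OP1.3) or when the new value is larger."""
    return old is _MISSING or new > old


def set_age(attrs, new):
    """Assign a new age to the attribute dictionary if the restriction holds."""
    old = attrs.get("age", _MISSING)
    if not restr_proc3(old, new):
        return False
    attrs["age"] = new
    return True
```

The first assignment to an empty attribute dictionary always succeeds; later assignments succeed only when the age increases.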


Constraints of category C2 can in principle not be treated this way; they
must be handled using a script. Such constraints typically refer to more than
one object, such as the constraint that a spouse of a spouse of someone must
be that someone. Evidently, such a constraint cannot be maintained if the
registration of a marriage involves two actions: changing the marital status of
the husband and doing the same with the wife. This typical constraint must be
defined in a script of the keeper of marriages.

Let us now turn our attention to the constraints of category C3. When the
properties of the members are not involved the constraint usually refers to
properties of the set itself, e.g. number of elements. The collection keeper should
now have an extra attribute to keep track of such a property. Upon insertion
and deletion of an element of the collection such an attribute must be updated
and possibly checked. With the tools described above one can have:


    type empl_adm
      has_a nr_of_employees: int
      has_a max_nr_of_employees: int
      has_a employees: collection of employee
      restriction nr_of_employees : Restr_proc4
    proc
      Restr_proc4:-
        N from nr_of_employees, M from max_nr_of_employees,
        N < M.
    endproc
    script
      at_trigger insert_empl: ...,
        N from nr_of_employees, N1 = N+1,
        N1 to nr_of_employees,
        /*if successful then insert in employee collection*/
        ... into employees, ...
    endscript
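The interplay between the counter attribute, its restriction and the insert trigger can be mimicked in a few lines of Python. This is a hedged sketch and not MOKUM code: the class EmplAdm and its method names are invented for illustration, and the strict inequality is taken over from Restr_proc4 as printed above.

```python
class EmplAdm:
    """Hypothetical stand-in for the collection keeper type empl_adm."""

    def __init__(self, max_nr_of_employees: int) -> None:
        self.nr_of_employees = 0
        self.max_nr_of_employees = max_nr_of_employees
        self.employees = []

    def _restr_proc4(self, n: int) -> bool:
        # Restriction on nr_of_employees: N < M, as in the listing.
        return n < self.max_nr_of_employees

    def insert_empl(self, employee: str) -> bool:
        """Update the counter first; insert only if the restriction holds."""
        n1 = self.nr_of_employees + 1
        if not self._restr_proc4(n1):
            return False  # update rejected, collection unchanged
        self.nr_of_employees = n1
        self.employees.append(employee)
        return True
```

The point of the sketch is the ordering: the counter update is checked before the element is put into the collection, so a rejected update leaves the collection untouched.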


The constraints of category C3.2, where properties of the members of a
collection are involved, can also be treated by restrictions. Take as example:

    $\forall e \in employees : e.age > 20$

This constraint requires that the update of the age attribute of some person
involves a check whether that person is, as an employee, member of employees.
Note that the MOKUM kernel knows the keepers of the collections to which
a certain object belongs; this facility had to be implemented for proper access
control anyhow. The above constraint is specified in the type of the collection
keeper, in our example: empl_adm, as if it were a restriction on the elements of
the collection:


    restriction employees : Restr_proc5
    proc forall employees: Restr_proc5:- (A from age, A > 20).
    endproc

The compiler can see that this restriction is actually a restriction on age of
employee objects and translates this restriction into the following one:

    proc Restr_proc6:- (A from age, A > 20). endproc


This restriction is then connected to the age attribute in person (age in
employee is inherited from person). The attribute age then has two restrictions
connected to it: Restr_proc1 and Restr_proc6. When the age attribute is
changed, Restr_proc1 is always called and Restr_proc6 is called only when the
object is a member of an employees collection.

The restriction is, of course, also executed when an element is added to the
collection.

Constraints  involving  existential  quantifiers  cannot  be  accommodated  by 
restrictions  in  MOKUM.  They  should  be  treated  in  the  script  parts.  The  same 
holds  true  for  constraints  of  category  C3.3. 

Looking at the list of operations, we see that the restrictions can be combined
with the operations OP1.3, OP1.4 and OP1.5 (there is no difference between
ordinary values and object-values in MOKUM) and OP3.1 and OP3.2 (the
set-membership operations). For the other operations MOKUM restrictions cannot
be used.

For  security  constraints,  involving  the  context,  in  the  form  of  the  invoker 
and  its  environment,  as  introduced  in  section  3.2,  we  see  that  restrictions  could 
be  used,  in  combination  with  a  MOKUM  procedure.  In  the  procedure  we  could 
specify  constraints  on  invoker  and  environment.  However,  this  would  be  very 
unnatural.  A  much  simpler  way  is  to  specify  the  security  constraints  in  the 
script,  as  we  have  shown  at  the  end  of  section  4. 


6  Conclusions 

In this paper we have demonstrated how integrity constraints, referring in
particular to the quality of the data, and security constraints, referring to
protection and access rights, are treated in an integrated way in the MOKUM
system. It is argued that all constraints, whether they refer to attribute values
of one or several objects, to properties of several collections combined with
properties of their members, or to access rights, can be written as Prolog
predicates in the form of a script. Many of them can also be specified in the
form of restrictions on attributes and collections. These are much easier to read
and to specify. For a secure implementation it turned out to be necessary to
build in some low level control performed by the MOKUM kernel; this pertained
to the checking of restrictions as well as to access control.

One  can  compare  this  treatment  with  the  one  in  [Bassiliades  &  Gray,  94], 
in  which  the  authors  describe  how  in  their  CoLan  system  all  constraints  are 
translated  into  Prolog  predicates,  which  are  also  connected  to  certain  update 
operations.  No  mention  was  made  in  their  paper,  however,  of  security  and 
protection  rules. 

It is our conviction that, because MOKUM has only two levels, objects and
types, constraints on objects and on collections can be specified in a very
direct way. It is also possible to make a compiler which takes some of the
burden away from the constraint specifier. It would be interesting to see how
far we can go in building a compiler which translates constraints written in a
high level language, like FOL, into the scripts we use for MOKUM objects.

Another future addition to the MOKUM system is an extension of the animation
facility with a window in which a real person, as end user, can interactively
communicate with his "own" object. The manipulations available to the real
person comprise all the commands which can be used in the trigger part of a
script. This facility makes it possible to experiment with a knowledge base
which is willfully changed in unforeseen ways, in order to demonstrate that
its constraints and security rules are sufficiently strong.

Another interesting problem is to make a tool which can check whether
constraints are inconsistent. For example, a constraint on an object type may
be in conflict with a constraint on the elements of a collection of objects of that
same type. This problem is undecidable in general, but it is solvable for
special cases such as range restrictions. A similar problem can be formulated
for constraints concerning types and subtypes. These problems are to be worked
out in the future.

Our group is also working on how to use linguistics in the area of Modelling
Information and Communication Systems. One of the projects is how to
translate a high level language, in which natural language and language in the
form of pictures and diagrams are mixed, into MOKUM specifications. This also
involves the security and integrity constraints we discussed in this paper.

ABSTRACT

The  SWORD  secure  DBMS  is  unique  in  that  it  provides  field  level  classifications 
which  are  not  equivalent  to  row  level  classifications.  This  has  a  significant  impact 
upon  the  query  language  used  to  access  the  database.  In  particular,  it  is  necessary  to 
handle  the  results  of  expressions  which  clients  are  able  to  know  exist,  but  are  not 
cleared  to  know  the  actual  value.  Also,  it  is  desirable  to  generate  detailed,  field  level 
information  labels.  The  paper  focusses  on  the  effect  of  these  requirements  on  the 
semantics  of  SWORD's  secure  variant  of  SQL. 

1.  INTRODUCTION 

In the SWORD Secure Relational DBMS [Wood et al 92], Information Flow is
controlled  using  field  level  classifications.  This  is  in  contrast  to  the  emerging  Secure 
RDBMS  products,  which  only  add  a  single  classification  per  row.  The  important 
difference  between  the  two  approaches  is  that  with  field  level  classification  a  client  is 
able  to  detect  the  existence  of  data  even  when  their  clearance  is  insufficient  to  allow 
them  to  observe  the  data's  value.  With  such  row  level  classification,  the  only  data  that 
can  be  detected  is  that  which  can  also  be  observed. 

This  paper  describes  the  extended  form  of  SQL,  called  Secure  SQL  (SSQL),  which  has 
been  devised  as  the  query  language  for  SWORD.  The  emphasis  is  on  the  semantic 
problems  associated  with  field  level  classifications,  support  for  trustworthy  clients  and 
providing  detailed  Information  Labels. 

The  addition  of  row  level  classifications  to  a  DBMS  need  have  very  little  impact  on  the 
query  language.  The  essential  addition  is  the  ability  to  use  the  row  label  in 
expressions.  With  SWORD's  field  level  classifications  there  is  a  need  to  handle 
values  which  the  client  is  not  cleared  to  observe,  and  this  makes  a  significant 
difference  to  the  semantics  of  the  query  language. 

Alternative  field  level  classification  schemes  have  been  proposed,  but  these  are 
equivalent to row level classifications [Qian&Lunt92]. In these schemes, the existence
of  a  field  is  hidden  from  a  client  with  insufficient  clearance  to  observe  its  contents,  by 
the  presence  of  a  lowly  classified  field  (which,  if  nothing  else,  contains  a  null).  This 
results  in  polyinstantiation,  with  consequent  loss  of  system  integrity  [Wiseman89]. 
SWORD  deliberately  adopted  the  Insert  Low  approach  [Wiseman90]  to  avoid 
polyinstantiation  and  the  associated  problems. 

Support  for  trustworthy  clients  is  often  provided  by  a  system  of  privileges,  but  these  can 
be  rather  crude  and  difficult  to  manage.  Instead  SSQL  allows  the  text  of  a  query  and  the 
fact  that  it  is  issued  to  be  classified  lower  than  the  client's  clearance,  while  still 
preserving  the  Information  Flow  policy.  Such  support  is  easier  to  use  and  manage  but 
significantly  complicates  the  Information  Flow  constraints  inherent  in  the  semantics, 
though  the  paper  does  not  describe  these  in  any  detail. 

The  Secure  DBMS  products  all  allow  simple  Information  Labels  to  be  attached  to  rows 
retrieved  from  the  database.  SSQL  tries  to  provide  more  detailed  information,  by 
labelling both retrieved rows and their fields, and by taking into account the subtleties
of complex predicates. DBMS Information Labels, which are sometimes called
Advisory labels, are strongly related to CMW [Berger et al 90] "Information Labels"
although the mechanics of their operation are quite different.

2.  SECURITY  POLICY 

The  SSQL  security  policy  constrains  information  flows  and  defines  Information 
Labels. 

2.1  Information  Flow  Security 

The  Information  Flow  Security  policy  is  a  strong  statement  about  the  way  in  which 
information  is  permitted  to  flow  through  the  DBMS.  All  inputs  (queries)  are  labelled 
with  classifications  that  indicate  the  sensitivity  of  the  information  conveyed  by  the 
input.  All  clients  are  labelled  with  a  classification,  their  clearance,  which  restricts  the 
information  they  are  permitted  to  observe. 

Roughly,  the  policy,  which  is  called  "No  Flows  Down",  is  that  no  client  may  learn 
anything  about  information  entered  into  the  database,  unless  their  clearance 
dominates  the  classification  given  that  information.  This  can  be  put  more  formally  as 
follows:

Clients  that  have  insufficient  clearance  to  see  any  difference  between 
two  sequences  of  inputs,  may  not  see  any  difference  in  the  sequence  of 
results  caused  by  those  inputs. 

This is a generalisation of the Non Interference property defined in Goguen and
Meseguer's seminal paper [Goguen&Meseguer82].

2.2  Labelling  Results 

The  result  of  a  query  provides  some  basic  facts  directly,  but  implies  more  facts  by  virtue 
of  the  context  in  which  it  arises.  That  is,  knowing  the  answer  gives  some  information, 
but  knowing  the  question  as  well  gives  much  more.  Generally,  the  direct  facts  are 
classified  lower  than  the  implied  facts. 

In  some  applications,  it  appears  useful  to  know  what  other  clients,  in  terms  of 
clearance,  are  able  to  learn  the  various  facts  encoded  in  a  result.  Thus  results  are 
labelled  to  provide  this  information.  A  formal  statement  of  the  policy  is  given  in 
[Wiseman93]. 

3.  MULTI-LEVEL  QUERIES  AND  RESULTS 

In  SSQL,  tables,  queries  and  the  results  of  a  select  query  are  all  multi-level  entities. 

3.1  Multi-level  tables 

Tables  in  SSQL  are  multi-level  entities,  structured  in  two-dimensions:  row  and 
column.  This  two-dimensional  structure  can  be  represented  as  an  object  hierarchy 
consisting  of  a  number  of  rows  each  with  the  same  number  of  fields. 

Each table has a classification, which is used to protect the table's schema information.
The existence of a table is classified independently by the classifications in the
hierarchical directory system which is used to name tables [Cant et al 94]. The
directory system is an important extension to SQL which allows tables to be named in a
similar way to files in an operating system. The directory structure allows tables with
sensitive names to be hidden inside highly classified directories.

The existence of a row in a table is given a classification, though this row existence
classification must dominate the classification of the table. In addition, no two rows in
a table are allowed to have incomparable existence classifications. This constraint
permits the rows to be stored in order of increasing existence classification, which is
necessary to prevent highly classified information from being encoded into the time
taken to process a query.

The  existence  of  each  column  in  a  table  is  given  a  classification,  with  similar 
restrictions.  However,  unlike  with  hidden  rows,  the  data  in  hidden  columns  can  be 
processed  without  affecting  query  execution  time  through  the  judicious  use  of  list 
manipulations.  Thus  rows  can  be  inserted  or  deleted,  without  execution  time  revealing 
highly  classified  information,  even  if  the  table  contains  hidden  columns. 

Each field in a table is also given a classification. This field classification must
dominate both the row existence class of the row it is within and the column existence
class of the column it is within. That is, the object hierarchy which makes up a table is
'Compatible' in the [Bell&LaPadula76] sense.
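The dominance rules for tables, rows, columns and fields can be collected into a single check. The sketch below is in Python and is not part of SWORD; it assumes, purely for illustration, a linear ordering U < C < S of classifications (a real system may use a partial order) and hypothetical function names.

```python
# Assumed linear ordering of classifications, for illustration only.
LEVELS = {"U": 0, "C": 1, "S": 2}


def dominates(a: str, b: str) -> bool:
    """True if classification a dominates classification b."""
    return LEVELS[a] >= LEVELS[b]


def table_is_compatible(table_class, row_classes, column_classes, field_classes):
    """Check the labelling rules of section 3.1.

    field_classes[i][j] is the classification of the field in row i, column j.
    """
    rows_ok = all(dominates(r, table_class) for r in row_classes)
    cols_ok = all(dominates(c, table_class) for c in column_classes)
    fields_ok = all(
        dominates(field_classes[i][j], row_classes[i])
        and dominates(field_classes[i][j], column_classes[j])
        for i in range(len(row_classes))
        for j in range(len(column_classes))
    )
    return rows_ok and cols_ok and fields_ok
```

In a linear ordering the additional requirement that row existence classifications be pairwise comparable holds trivially, so it is not checked here.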

For  each  column  in  a  table,  there  is  a  row  in  a  data  dictionary  table  which  describes  it. 
If  the  existence  of  a  column  is  to  be  hidden  from  users  with  low  clearances,  then  the 
corresponding  row  in  the  data  dictionary  must  also  be  hidden.  This  is  achieved  by 
classifying the existence of the row to the classification of the column it describes. This
is the only practical use of tables with varying row classifications that has so far been
identified, so the restrictions on their use do not appear to be a problem.

3.2  Multi-level  queries 

Queries in SSQL are generally multi-level entities. However, those clients which
cannot be trusted to maintain the separation between the different elements of the
query are constrained to issuing single-level queries. Multi-level queries allow
trustworthy clients to change the database in ways which are visible to clients with
lesser clearance, while at the same time preserving "No Flows Down".

Within  the  SSQL  syntax,  it  is  only  possible  to  label  expressions.  This  is  achieved  by 
enclosing  the  expression  in  brackets  and  prepending  a  (constant)  classification.  The 
classification  of  the  query  as  a  whole  is  given  by  a  classification  prepended  to  the 
query. 

The  following  query  is  shown  as  an  example.  It  is  given  an  overall  classification  of 
Confidential,  but  part  of  its  where  clause  is  classified  higher,  at  Secret. 

    [C] SELECT * FROM flights
        WHERE dest IS NOT NULL AND [S]( cargo = 'Bombs' )

This  query  can  be  seen  in  its  entirety  by  clients  with  clearances  of  Secret  or  above. 
However,  clients  with  Confidential  clearance  would  see  a  censored  structure: 

    [C] SELECT * FROM flights
        WHERE dest IS NOT NULL AND [S]( not cleared )

In practice, one client is not able to observe the queries issued by another. However, it is
sometimes possible to infer details about those queries by observing the effect they have
on the database. In these cases, SSQL is such that effects observed by a client will not
reveal anything about the nature of parts of others' queries that are classified higher
than the client's clearance.

In  the  case  of  stored  procedures,  a  client  may  execute  a  query  which  is  not  completely 
visible  to  them.  In  this  case,  the  query  will  fail  if  that  part  of  the  query  turns  out  to  be 
relevant  to  the  query's  execution.  For  example,  suppose  a  client  cleared  to  Confidential 
executes  the  above  query.  This  will  successfully  return  no  rows  if  all  rows  in  the  table 
have  a  null  dest  field,  however  it  will  fail  if  any  row  has  a  non-null  dest,  because  it  is 
then  necessary  to  compute  the  'invisible'  condition. 

3.3  Multi-level  results 

The  result  of  an  update,  insert  or  delete  query  is  a  simple  message,  however  select 
queries  yield  Derived  Tables  which  are  themselves  multi-level  entities.  The  structure 
of  a  Derived  Table  is  similar  to  that  of  real  tables,  but  the  labels  are  used  to  convey 
additional  information  about  how  the  result  was  formed.  These  labels  may  be 
'Incompatible',  meaning  that  the  labels  may  decrease  as  the  tree  is  descended.  Hence, 
untrusted  clients  may  usefully  be  given  a  multi-level  result,  but  their  clearance  will  be 
greater  than  all  the  labels  within  it.  The  derivation  of  the  labels  in  Derived  Tables  is 
discussed  in  §6. 

4.  CALCULATING  THE  VALUE  OF  EXPRESSIONS 

In  SSQL,  the  client  does  not  always  have  sufficient  clearance  to  be  able  to  compute  an 
expression.  This  has  a  significant  effect  on  the  semantics  of  expressions. 

4.1  Arithmetic  and  Comparisons 

An  SSQL  query  will  only  be  evaluated  in  terms  of  those  rows  whose  existence  is 
classified  at  a  level  which  is  dominated  by  the  client's  clearance.  In  general,  the  query 
will  have  to  compute  expressions  for  these  rows.  These  expressions  are  used  in  Where 
clauses,  Select  lists  and  so  forth. 

In  some  cases,  the  results  of  these  expressions  will  depend  upon  the  contents  of  fields 
which  are  classified  at  a  level  which  is  not  dominated  by  the  client's  clearance.  This 
means  the  value  of  the  result  cannot  be  revealed  to  the  client,  even  though  the  client  is 
cleared  to  know  that  there  is  such  a  result. 

In  effect,  the  result  of  computing  an  expression  for  a  given  row  is  either  a  value  of  the 
appropriate  type,  or  a  special  value  called  Not  Cleared.  If  the  client's  clearance  permits 
them  to  learn  the  result  of  the  expression,  they  receive  the  proper  value,  while  if  their 
clearance  is  insufficient  they  receive  Not  Cleared.  Note  that  Not  Cleared  is  special 
only  because  it  cannot  be  stored  in  a  field  of  a  table,  even  though  it  can  be  stored  by  the 
client  software  if  required. 

The  simplest  form  of  expression  is  a  constant.  The  result  of  such  an  expression  is  that 
constant,  and  is  always  visible  to  the  client  because  the  client  provided  the  constant  as 
part  of  the  query. 

Another  simple  form  of  expression  is  a  column  name,  the  result  of  which  is  the  value  of 
the  row's  field  in  that  column.  However,  in  SSQL  the  client's  clearance  is  first  checked 
against  the  field's  classification.  If  the  clearance  dominates  the  classification,  the 
result  is  the  field’s  contents,  otherwise  the  true  value  is  ignored  and  the  result  is  Not 
Cleared. 
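The rule for a column name used as an expression can be sketched as follows. This is a Python illustration rather than SSQL itself; the linear ordering U < C < S, the sentinel object and the function name read_field are all assumptions introduced for the example.

```python
# Assumed linear ordering of classifications, for illustration only.
LEVELS = {"U": 0, "C": 1, "S": 2}


class NotClearedType:
    """Sentinel standing for the special value Not Cleared."""

    def __repr__(self) -> str:
        return "Not Cleared"


NOT_CLEARED = NotClearedType()


def read_field(value, field_class: str, clearance: str):
    """Return the field's contents only if the clearance dominates its class."""
    if LEVELS[clearance] >= LEVELS[field_class]:
        return value
    # The true value is ignored; the client learns only that a value exists.
    return NOT_CLEARED
```

For instance, under these assumptions a Confidential client reading a Secret field obtains the sentinel, while a Secret client obtains the stored value.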

For arithmetic and simple comparisons, the result is Not Cleared if any of the
arguments is Not Cleared. Some examples are shown in the following table.


    >             1             2             Not Cleared
    1             False         False         Not Cleared
    2             True          False         Not Cleared
    Not Cleared   Not Cleared   Not Cleared   Not Cleared
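The propagation rule for arithmetic and simple comparisons amounts to wrapping each ordinary operator so that a Not Cleared argument forces a Not Cleared result. A minimal Python sketch, with a hypothetical sentinel and helper name, is:

```python
import operator

NOT_CLEARED = object()  # hypothetical sentinel for the value Not Cleared


def strict(op):
    """Lift a binary operator so that Not Cleared in any argument wins."""

    def wrapped(a, b):
        if a is NOT_CLEARED or b is NOT_CLEARED:
            return NOT_CLEARED
        return op(a, b)

    return wrapped


greater_than = strict(operator.gt)
add = strict(operator.add)
```

Applied to the operands 1, 2 and Not Cleared, greater_than yields the entries of the table above, with the row as left operand and the column as right operand.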

Actually,  SSQL  is  slightly  over-strong  in  this  respect  because  it  is  sometimes  possible  to 
give  a  correct  answer  even  when  one  argument  is  Not  Cleared.  For  example,  the 
smallest  integer  is  less  than  or  equal  to  all  other  such  integers  of  the  same  precision. 
Thus  a  query  language  with  the  following  behaviour  for  16  bit  two's  complement 
integers  would  still  be  secure: 


    >             -32768        2             Not Cleared
    -32768        False         False         False
    2             True          False         Not Cleared
    Not Cleared   Not Cleared   Not Cleared   Not Cleared

However,  SSQL  does  not  provide  for  such  special  cases  as  any  implementation  would  be 
rather  inefficient  and  the  relaxation  does  not  seem  particularly  worthwhile. 

4.2  Field  &  Row  Labels 

In  SSQL  it  is  possible  to  use  the  labels  attached  to  fields  and  rows  as  expressions  of  type 
Class.  If  the  rows  are  from  a  joined  Derived  Table,  it  is  possible  to  select  the  row  labels 
of  the  individual  tables  that  contribute  to  the  join.  The  syntax  is  as  follows: 

    CLASS OF column_name          -- label of a field
    CLASS OF ROW                  -- label of row
    CLASS OF ROW OF table_name    -- label of contribution to joined row


A client who has sufficient clearance to detect the existence of a row is always able to
observe the row's existence classification and the classifications of its fields. Although
some applications may not wish this information to be so freely available, it is not
precluded by "No Flows Down". When necessary, stricter control can be achieved
through the judicious use of triggers [Lewis et al 92].

The  emerging  secure  DBMS  products  only  support  row  labels,  and  all  essentially  treat 
this  as  an  additional  column  of  the  table.  Unfortunately,  this  approach  does  not  scale  to 
field  level  labelling  and  so  SSQL  uses  a  different  form  of  concrete  syntax.  Another 
advantage of the SSQL syntax is that it requires no additional reserved words. This is
because, in SQL, '<ident> OF' is not legal syntax.

4.3  AND/OR  Predicates 

Boolean  expressions,  which  are  called  predicates  in  standard  SQL,  can  often  be 
computed  accurately  even  when  one  argument  is  Not  Cleared.  For  example,  when 
False  is  ANDed  with  either  True  or  False,  the  result  is  False.  So  if  the  client's 
clearance  is  sufficient  to  determine  that  one  argument  is  False,  the  result  can  be 
determined  without  examination  of  the  other  argument,  even  if  it  is  Not  Cleared. 

One  further  complication  in  SQL  is  that  truth  values  are  three-valued,  since  a  predicate 
may  evaluate  to  Null  (called  unknown  in  the  standard).  Thus  in  SSQL,  truth  values 
are  four-valued,  since  predicates  evaluate  to  False,  True,  Null  or  Not  Cleared.  This  is 
shown  in  the  following  truth  tables: 

    AND           False         True          Null          Not Cleared
    False         False         False         False         False
    True          False         True          Null          Not Cleared
    Null          False         Null          Null          Not Cleared
    Not Cleared   False         Not Cleared   Not Cleared   Not Cleared

    OR            False         True          Null          Not Cleared
    False         False         True          False         Not Cleared
    True          True          True          True          True
    Null          False         True          Null          Not Cleared
    Not Cleared   Not Cleared   True          Not Cleared   Not Cleared
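The AND table can be reproduced by a simple precedence rule: False decides the result outright, otherwise Not Cleared takes over, otherwise Null, otherwise True. The following Python sketch covers AND only; the string encoding of the four truth values and the function name ssql_and are assumptions for illustration.

```python
FALSE, TRUE, NULL, NOT_CLEARED = "False", "True", "Null", "Not Cleared"


def ssql_and(x: str, y: str) -> str:
    """Four-valued AND following the precedence False > Not Cleared > Null > True."""
    if FALSE in (x, y):
        # One visible False settles the result, whatever the other argument is.
        return FALSE
    if NOT_CLEARED in (x, y):
        return NOT_CLEARED
    if NULL in (x, y):
        return NULL
    return TRUE
```

The first branch is what allows a client to obtain a definite answer without being cleared for the other argument.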

The  behaviour  of  AND  and  OR  predicates  with  respect  to  Not  Cleared  values  provides 
some  very  important  functionality.  When  updating  a  table,  the  Where  clause 
expression  is  used  to  determine  whether  a  particular  row  should  be  modified  (see  §5). 
However,  if  computing  the  Where  clause  results  in  Not  Cleared  for  any  row,  the  query 
is  abandoned  with  no  rows  being  modified.  If  the  client  wishes  to  update  those  rows 
which  meet  the  Where  clause  criteria,  except  where  the  clearance  is  insufficient  to 
determine this, it is necessary to augment the Where clause as follows:

    UPDATE ... WHERE condition AND clearance is high enough to compute condition

As  a  concrete  example,  consider  the  following  table  and  update  query  issued  by  a  client 
with  a  clearance  of  Confidential: 

    Payload
    Id          Weight
    123 [U]     42 [C]
    456 [U]     42 [S]
    789 [U]     0  [C]

UPDATE  Payload  SET  Id  =  0 

WHERE  Weight  >  10  AND  CLEARANCE  DOM  CLASS  OF  Weight 

With  the  first  row,  the  client  is  able  to  observe  the  Weight  and  hence  can  determine  that 
the  Where  clause  expression  is  True.  Similarly,  with  the  third  row,  the  client  is  able  to 
determine  that  the  Where  clause  is  False.  However,  with  the  second  row,  the  client  is 
not  able  to  observe  the  Weight,  hence  the  first  part  of  the  Where  clause  is  Not  Cleared, 
but  the  second  part  of  the  Where  clause  yields  False.  Thus  the  entire  clause  is  False  and 
the  row  is  not  selected  for  update.  Without  this  second  part,  the  Where  clause  would 
yield Not Cleared and the whole update would fail.

Such  Where  clauses  are  particularly  useful,  but  adding  the  correct  clearance  check  is 
tiresome  and  error  prone,  particularly  when  AND  and  OR  predicates  are  nested.  So,  to 
make  things  easy,  SSQL  introduces  two  new  monadic  boolean  operators  (predicates), 
called  DEFINITELY  and  POSSIBLY.  These  are  defined  by  the  following  truth  table: 


    x             DEFINITELY x   POSSIBLY x
    False         False          False
    True          True           True
    Null          Null           Null
    Not Cleared   False          True

The example query given above can now be restated using the DEFINITELY operator:

    UPDATE Payload SET Id = 0 WHERE DEFINITELY Weight > 10

The POSSIBLY operator would be used when some action is required if a condition can
be seen to be True or the client's clearance is insufficient to compute it.
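The effect of the two operators on the Payload example can be traced with a small simulation. The Python below is a sketch under stated assumptions, not SSQL: the linear ordering U < C < S, the string encoding of truth values, the dictionary layout of the rows and all function names are invented for illustration.

```python
LEVELS = {"U": 0, "C": 1, "S": 2}  # assumed linear ordering
FALSE, TRUE, NOT_CLEARED = "False", "True", "Not Cleared"

PAYLOAD = [
    {"Id": 123, "Weight": 42, "WeightClass": "C"},
    {"Id": 456, "Weight": 42, "WeightClass": "S"},
    {"Id": 789, "Weight": 0, "WeightClass": "C"},
]


def definitely(x):
    """Not Cleared becomes False; every other value is unchanged."""
    return FALSE if x == NOT_CLEARED else x


def possibly(x):
    """Not Cleared becomes True; every other value is unchanged."""
    return TRUE if x == NOT_CLEARED else x


def weight_gt_10(row, clearance):
    """Evaluate 'Weight > 10' for one row as seen by the given clearance."""
    if LEVELS[clearance] < LEVELS[row["WeightClass"]]:
        return NOT_CLEARED
    return TRUE if row["Weight"] > 10 else FALSE


def rows_to_update(clearance, wrap=lambda x: x):
    """Return the Ids selected for update, or fail as a whole."""
    results = [wrap(weight_gt_10(row, clearance)) for row in PAYLOAD]
    if NOT_CLEARED in results:
        # The query is abandoned with no rows being modified.
        raise PermissionError("Where clause is Not Cleared for some row")
    return [row["Id"] for row, res in zip(PAYLOAD, results) if res == TRUE]
```

Under these assumptions a Confidential client selects only row 123 with DEFINITELY, rows 123 and 456 with POSSIBLY, and has the whole update abandoned when neither operator is used.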

4.4 Set Functions

SQL provides a variety of set functions (sometimes called aggregate functions) such as
Sum and Maximum. These are applied to a multi-set of values which are obtained by
evaluating an expression for a number of rows. Optionally, duplicates may be removed
from the multi-set before the function is computed. In SSQL additional set functions are
provided, including set forms of Least Upper Bound and Greatest Lower Bound on
classifications.

In SSQL the result of one of these functions is Not Cleared if any of the values in
the multi-set is Not Cleared.

Standard SQL also includes an additional set function called COUNT. This has two
forms: COUNT(*), which counts the number of rows in the set being considered, and
COUNT(DISTINCT exp), which counts the number of rows that give a distinct, non
null answer for the given expression.

In SSQL the COUNT(*) function is straightforward, in that it simply provides a count
of the number of rows in the set being considered. The set will only ever contain rows
whose existence is classified with a classification that is dominated by the client's
clearance. Thus the client is always cleared to detect all the rows and hence may
always learn their number.

In SSQL the Distinct form of Count is more interesting because it includes an implied
equality test and test for Null. If any rows are such that the expression yields Not
Cleared, it is not possible to determine whether the row should be discounted, either on
the grounds of giving Null or being a duplicate value. So, if the client's clearance is
insufficient to compute the expression for any of the rows being considered, the overall
result of the Count Distinct is Not Cleared.

The following table shows the results of computing various set functions against the
rows in the Payload table for clients with two different clearances:


    CLIENT CLEARANCE   SUM(Weight)   COUNT(*)   COUNT(DISTINCT Weight)
    [C]                Not Cleared   3          Not Cleared
    [S]                84            3          2
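The set function rules can be checked against the Payload table with a short simulation. The Python below is an illustrative sketch only: the linear ordering U < C < S, the tuple layout of the rows, the sentinel and the function names are assumptions, and Null is represented by None.

```python
LEVELS = {"U": 0, "C": 1, "S": 2}  # assumed linear ordering
NOT_CLEARED = object()             # hypothetical sentinel

# (Id, Weight, classification of Weight); all rows exist at [U].
PAYLOAD = [(123, 42, "C"), (456, 42, "S"), (789, 0, "C")]


def visible_weights(clearance):
    """Weight of every row as seen by a client with the given clearance."""
    return [
        w if LEVELS[clearance] >= LEVELS[c] else NOT_CLEARED
        for _, w, c in PAYLOAD
    ]


def ssql_sum(values):
    """Sum of the non-null values, or Not Cleared if any value is hidden."""
    if any(v is NOT_CLEARED for v in values):
        return NOT_CLEARED
    return sum(v for v in values if v is not None)


def count_star(values):
    """Number of rows; every row in the set is detectable by the client."""
    return len(values)


def count_distinct(values):
    """Number of distinct non-null values, or Not Cleared if any is hidden."""
    if any(v is NOT_CLEARED for v in values):
        return NOT_CLEARED
    return len({v for v in values if v is not None})
```

Evaluating the three functions on visible_weights("C") and visible_weights("S") gives the two rows of the table above.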

4.5 Quantified Predicates

In standard SQL, a Quantified Predicate is one where an expression is compared
against each of the values in a one column wide Derived Table. All the usual
arithmetic comparisons may be made and there are two forms of the predicate: one
which gives whether some selected value meets the condition and one which gives
whether they all do.

In effect, such predicates are a series of tests, ANDed together in the case of the ALL
form and ORed together in the case of the SOME form. Hence in SSQL the rules of AND
and OR with respect to Not Cleared values apply.

4.6  Exceptions 

The SQL standard does not detail how arithmetic exceptions such as overflow should be
handled. In a secure environment, such exceptional behaviour could be the source of
illicit information flows, so it is important to be clear about the effects.

The following fragment of an embedded SSQL/Ada program shows how inappropriate
exceptional behaviour could be exploited by a client to obtain the (integer) value of a
field which has a classification not dominated by the client's clearance:

    declare
      SSQL DECLARE STATEMENT st AS
        SELECT Weight + :p FROM Payload WHERE Id = 456;
      SSQL DECLARE CURSOR c;
      SSQL DECLARE VARIABLE res : Integer;
      SSQL DECLARE VARIABLE try : Integer := Integer'last;  -- start with largest
    begin
      loop
        SSQL OPEN CURSOR c FOR st USING try FOR p;
        begin
          SSQL FETCH FROM c INTO res;
          exit;  -- weight + try = integer'last
        exception
          when ssql_interface.overflow => null;  -- weight + try > integer'last
        end;
        try := try - 1;
      end loop;
      text_io.put_line( "weight = " & integer'image(integer'last-try) );
    end;

This  repeatedly  attempts  to  add  some  constant  to  the  quantity  field  of  a  particular  row. 
The  constant  starts  off  as  large  as  possible,  and  is  gradually  decreased.  At  first  an 
arithmetic  overflow  will  occur  during  query  processing,  but  this  is  handled  by  the  loop 
and  the  search  continues.  Eventually  the  constant  is  made  small  enough  so  that  an 
overflow  does  not  occur.  At  this  point  a  forced  exit  from  the  loop  is  made  and  the  value 
in  the  field  can  be  computed. 
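
The search described above can be simulated outside any database. The following Python
sketch is only an illustration under stated assumptions: `insecure_query`, `Overflow` and
`infer_weight` are hypothetical names, a 16 bit integer range is used to keep the linear
search short, and the secret is assumed to be non-negative. The server never returns the
sum, yet the presence or absence of the overflow exception is enough to recover the value.

```python
INT_MIN, INT_MAX = -2**15, 2**15 - 1   # assumed 16 bit range, for a short search


class Overflow(Exception):
    """Raised by the hypothetical insecure server on arithmetic overflow."""


def insecure_query(secret_weight, p):
    """Evaluate Weight + :P, withholding the result from the uncleared client."""
    total = secret_weight + p
    if not INT_MIN <= total <= INT_MAX:
        raise Overflow()      # the exception itself is the illicit flow
    return None               # the value is withheld: client is not cleared


def infer_weight(query):
    """Recover a non-negative secret from the overflow behaviour alone."""
    trial = INT_MAX           # start with the largest constant
    while True:
        try:
            query(trial)
            break             # first success: secret + trial == INT_MAX
        except Overflow:
            trial -= 1        # still overflowing, try a smaller constant
    return INT_MAX - trial


leaked = infer_weight(lambda p: insecure_query(1234, p))
print(leaked)
```

The loop needs one query per unit of the secret's value, so the attack is slow but entirely
mechanical.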

SSQL avoids such problems by introducing exceptions as special values. With respect
to classification checks, these exceptions are treated like any other data. In effect, the
clearance check occurs before the arithmetic takes place. The following table, which
gives some example results for addition of 16 bit twos complement integers, shows this:


+             -32768        1             32767         Not Cleared
-32768        Overflow      -32767        -1            Not Cleared
1             -32767        2             Overflow      Not Cleared
32767         -1            Overflow      Overflow      Not Cleared
Not Cleared   Not Cleared   Not Cleared   Not Cleared   Not Cleared
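
The addition table can be reproduced with a few lines of Python. This is a minimal sketch,
not the SWORD implementation: `ssql_add` and the string markers are hypothetical, and the
rule that an Overflow operand propagates through further arithmetic is an assumption that
the table itself does not cover.

```python
NOT_CLEARED = "Not Cleared"
OVERFLOW = "Overflow"
INT_MIN, INT_MAX = -2**15, 2**15 - 1   # 16 bit twos complement range


def ssql_add(a, b):
    """Add two operands, treating Not Cleared and Overflow as ordinary values."""
    # The clearance check comes first, so an uncleared operand hides
    # whether the arithmetic would have overflowed.
    if a == NOT_CLEARED or b == NOT_CLEARED:
        return NOT_CLEARED
    # Assumption: an exceptional operand yields an exceptional result.
    if a == OVERFLOW or b == OVERFLOW:
        return OVERFLOW
    total = a + b
    return total if INT_MIN <= total <= INT_MAX else OVERFLOW


for left in (-32768, 1, 32767, NOT_CLEARED):
    print(left, [ssql_add(left, right) for right in (-32768, 1, 32767, NOT_CLEARED)])
```

Because the Not Cleared test precedes the range test, the overflow search shown earlier
learns nothing: every probe of an uncleared field returns the same value.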

For  comparisons  the  result  is  an  exceptional  value  if  any  of  the  arguments  are 
exceptional  values,  but  for  ANDs  and  ORs  an  exceptional  value  as  an  argument  need 
not  lead  to  an  exceptional  result,  as  the  following  table  illustrates: 


x             y           x AND y       x OR y
True          Exception   Exception     True
False         Exception   False         Exception
Not Cleared   Exception   Not Cleared   Not Cleared
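
A small Python sketch of the rows in this table follows. The function names and string
markers are hypothetical. Null is omitted, and combinations of Not Cleared with values
other than Exception are deliberately left unimplemented, because this table does not
show them.

```python
EXCEPTION = "Exception"
NOT_CLEARED = "Not Cleared"


def _check_not_cleared(x, y):
    """Return NOT_CLEARED for the one combination tabulated, else None."""
    if NOT_CLEARED in (x, y):
        if EXCEPTION in (x, y):
            return NOT_CLEARED
        raise NotImplementedError("combination not shown in this table")
    return None


def ssql_and(x, y):
    hidden = _check_not_cleared(x, y)
    if hidden is not None:
        return hidden
    if x is False or y is False:
        return False              # one False argument decides an AND
    return EXCEPTION if EXCEPTION in (x, y) else True


def ssql_or(x, y):
    hidden = _check_not_cleared(x, y)
    if hidden is not None:
        return hidden
    if x is True or y is True:
        return True               # one True argument decides an OR
    return EXCEPTION if EXCEPTION in (x, y) else False


print(ssql_and(True, EXCEPTION), ssql_or(True, EXCEPTION))
```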

5.  WHERE,  GROUPBY  AND  HAVING  CLAUSES 

An  SQL  query  contains  three  clauses,  Where,  Groupby  and  Having,  which  are  used  to 
identify  the  rows  that  are  to  be  selected,  updated  or  deleted. 


The Where clause is processed first. It consists of a predicate (boolean expression)
which is evaluated for each of the rows in the table being processed. If the expression
yields False or Null for a particular row, that row is removed from future
consideration. The default Where clause is the constant True.

The Groupby clause, which consists of a set of column names, is now applied. The rows
being processed are divided into the minimum number of groups, such that the rows in
a particular group all have the same values in the fields of the specified columns. If no
Groupby clause is specified, all the rows are formed into one large group.

Finally, the Having clause is applied. This is a predicate which is evaluated for each
group. The expression must evaluate to the same value for each row in a group,
otherwise the query fails completely. Any group for which the Having clause is False
or Null is now discarded en masse. The default Having clause is the constant True.

The main effect of the Groupby clause is on set functions. For a particular row, a set
function applies to all the rows in that row's group. In SSQL, set functions in the Where
clause apply to all the rows in the table being processed.

In SSQL, for Update and Delete queries and Select queries that occur inside other
queries, the client clearance must be sufficient to compute the value of the Where clause
for each and every row in the table. If the result of the Where expression for any row is
Not Cleared, or is some exception, the query fails. For Select queries at the outer level,
such rows are ignored and the select continues, although the client is able to ask
whether this has occurred.
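
The two behaviours of the Where clause can be sketched as follows. This is an
illustrative Python fragment, not SSQL itself: `apply_where`, `QueryFailed` and the
string markers are hypothetical, and each row is assumed to arrive already paired with
the outcome of its Where expression (True, False, None for Null, Not Cleared or
Exception).

```python
NOT_CLEARED, EXCEPTION = "Not Cleared", "Exception"


class QueryFailed(Exception):
    """The whole query is rejected."""


def apply_where(evaluated_rows, outer_select):
    """Filter (row, outcome) pairs; return the kept rows and a 'skipped' flag."""
    kept, skipped = [], False
    for row, outcome in evaluated_rows:
        if outcome in (NOT_CLEARED, EXCEPTION):
            if not outer_select:
                # Update, Delete and nested Select must see every row.
                raise QueryFailed("Where clause not computable for every row")
            skipped = True        # the client may later ask whether this happened
        elif outcome is True:
            kept.append(row)      # False and Null rows are simply dropped
    return kept, skipped


rows = [("r1", True), ("r2", False), ("r3", NOT_CLEARED), ("r4", None)]
print(apply_where(rows, outer_select=True))
```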

Processing the Groupby clause is in effect a number of equality tests. Thus the client
must be cleared to observe all the data in the columns mentioned in the Groupby list. If
this were not so for some row, it would not be possible to decide in which group the row
belongs. Hence, if the classification of any of the data in the Groupby columns is not
dominated by the client's clearance, the query fails.
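
The Groupby rule can be sketched in Python as below. The names are hypothetical, each
field is modelled as a `(value, classification)` pair, and classifications are assumed to
be integers in a total order so that "dominated by" becomes `<=`; the real lattice is only
partially ordered.

```python
from collections import defaultdict


class QueryFailed(Exception):
    """The whole query is rejected."""


def group_rows(rows, group_cols, clearance):
    """Group rows on group_cols, failing if any grouping field is not observable."""
    groups = defaultdict(list)
    for row in rows:
        key = []
        for col in group_cols:
            value, classification = row[col]
            if classification > clearance:
                # Without this value the row's group cannot be decided.
                raise QueryFailed(f"not cleared for column {col}")
            key.append(value)
        groups[tuple(key)].append(row)
    return dict(groups)


payload = [
    {"Id": (1, 0), "Weight": (10, 0)},
    {"Id": (2, 0), "Weight": (10, 1)},
    {"Id": (3, 0), "Weight": (25, 0)},
]
print(sorted(group_rows(payload, ["Weight"], clearance=1)))
```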

SSQL also allows rows to be grouped according to the classification of fields in a
column. However, a client is always able to observe the field classifications in any row
that they are cleared to know exists. Hence it is always possible to form groups on this
basis. The following example illustrates how the syntax of Groupby clauses has been
extended to allow this:

SELECT AVG(Weight) FROM Payload GROUP BY CLASS OF Weight

The rules for processing the Having clause are similar to those for the Where clause.

6. DERIVED TABLES

A Derived Table can be formed by extracting data from a stored table or by combining
other Derived Tables in a variety of ways.

6.1 Field Classifications

The  field  classifications  of  a  Derived  Table  are  effectively  Information  Labels,  which 
are  the  minimum  clearance  a  client  can  possess  such  that  it  is  still  possible,  using 
some  query  or  other,  to  learn  the  facts  derived  from  the  fields. 

The Information Label given for a constant is always Unclassified (lattice bottom).
This is because all clients know about all constants, so anyone can construct a constant
expression.




For  a  column  name  expression,  the  field's  classification  is  used  as  the  Information 
Label.  This  is  because  the  field  classification  is  the  lowest  clearance  a  client  can  have 
if  they  are  to  learn  the  field's  value. 

The  Information  Label  generated  for  arithmetic  operations  and  comparisons  is  given 
by  the  least  upper  bound  of  the  arguments'  Information  Labels.  This  is  because  a  client 
must  be  cleared  to  know  the  value  of  all  arguments  in  order  to  learn  the  result. 

Deriving the Information Label generated for an AND/OR predicate is more complex.
This is because the result of such an expression can sometimes be known even when
some arguments are not known. As the following table shows, the Information Label is
usually the least upper bound of the arguments' Information Labels. However, if the
result of an AND is False, then knowing just one of the False arguments is enough to
learn the answer. Hence the Information Label should be that of the 'lowest' False
argument. Similarly for OR.

Result        Label for AND               Label for OR
False         'lowest' False argument     lub arguments
True          lub arguments               'lowest' True argument
Null          lub arguments               lub arguments
Not Cleared   lub arguments               lub arguments

Unfortunately, security classifications are not always comparable, so there may not be
a 'lowest'. In the case where the Information Labels of all the False arguments are
comparable, the Information Label of the AND's result is simply the greatest lower
bound of the False arguments' Information Labels. When they are incomparable, there
is no one correct Information Label. This is because there are several possible
'minimum' clearances which are sufficient to learn the result.
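
The comparable case can be sketched in Python. The representation is an assumption made
for illustration: a label is a `(level, categories)` pair with an integer level and a
frozenset of category names, and `and_label` is a hypothetical name. The sketch covers
only True and False arguments and stops with an error when the False labels are
incomparable, since no single 'lowest' label exists in that case.

```python
from functools import reduce


def dominates(a, b):
    """True if label a is at least as high as label b."""
    return a[0] >= b[0] and a[1] >= b[1]


def lub(a, b):
    return (max(a[0], b[0]), a[1] | b[1])


def glb(a, b):
    return (min(a[0], b[0]), a[1] & b[1])


def and_label(args):
    """Information Label of an AND over (value, label) arguments."""
    false_labels = [label for value, label in args if value is False]
    if not false_labels:
        # Every argument is needed to learn the result.
        return reduce(lub, [label for _, label in args])
    comparable = all(dominates(a, b) or dominates(b, a)
                     for a in false_labels for b in false_labels)
    if not comparable:
        raise ValueError("incomparable False labels: no single 'lowest' label")
    return reduce(glb, false_labels)   # the lowest False argument


U = (0, frozenset())
C = (1, frozenset())
S = (2, frozenset({"nato"}))
print(and_label([(False, S), (False, C), (True, U)]))
```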

Rather than support multiple Information Labels, which is just too difficult to
implement and does not seem very useful, the incomparable Information Labels are
combined in the following way. First the labels are partitioned into the minimum
number of sets of labels which are comparable. The greatest lower bound from each set
is taken and then the least upper bound of these is used as the result's Information
Label. Although this operation sounds complex, an iterative bit-pattern
implementation is quite straightforward.

For  arithmetic  set  functions,  such  as  SUM  and  AVG,  the  Information  Label  is  the  least 
upper  bound  of  all  the  Information  Labels  of  the  expressions  calculated  for  each  row. 

The set function COUNT(*) returns the number of rows in the table (strictly, the
current  group  of  the  table).  The  result  is  completely  unrelated  to  the  contents  of  the  rows, 
but  reveals  information  about  the  existence  of  each  row.  Hence,  the  Information  Label 
is  the  least  upper  bound  of  the  existence  classifications  of  each  row. 

For COUNT(DISTINCT exp) the expression is calculated for each row, and rows
giving  duplicate  or  null  values  are  ignored.  Thus  the  result  depends  on  both  the  values 
of  the  expressions  and  the  existence  of  the  rows.  Hence,  the  Information  Label  is  the 
least  upper  bound  of  the  Information  Labels  for  the  expressions  and  the  existence 
classification  of  each  row  (note  that  for  Derived  Tables,  the  row  existence 
classification  may  strictly  dominate  the  expression's  Information  Label). 
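
The three set function rules can be compared side by side in a short Python sketch. The
function names are hypothetical, each row is modelled as an
`(expression_label, existence_label)` pair, and labels are assumed to be integers in a
total order so that the least upper bound is simply `max`; 0 stands for Unclassified.

```python
def sum_label(rows):
    """SUM and AVG: lub of the expression labels."""
    return max((expr for expr, _ in rows), default=0)


def count_star_label(rows):
    """COUNT(*): lub of the row existence classifications."""
    return max((exists for _, exists in rows), default=0)


def count_distinct_label(rows):
    """COUNT(DISTINCT exp): lub of both kinds of label."""
    return max((max(expr, exists) for expr, exists in rows), default=0)


rows = [(0, 1), (2, 1), (1, 3)]
print(sum_label(rows), count_star_label(rows), count_distinct_label(rows))
```

In this data the last row exists at a higher level than its expression label, which is
the situation the note about Derived Tables describes.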

For  quantified  predicates,  the  Information  Label  is  derived  according  to  the  rules 
given  for  AND/OR  predicates,  though  the  existence  classification  of  the  rows  is  also 
taken  into  account. 
6.2 Row Classifications


When  a  Derived  Table  is  described  by  a  Select  query,  the  presence  of  a  row  is 
determined by the query's Where clause. Thus the row's Information Label is a
classification  which  is  the  minimum  clearance  needed  to  discover  that  the  Where 
clause  expression  is  True.  This  classification  may  be  higher  than  that  needed  to 
compute  the  selected  expressions.  Thus,  the  Information  Label  attached  to  a  row  need 
not  be  dominated  by  the  Information  Labels  on  the  fields. 

In  general,  the  row's  existence  is  also  influenced  by  the  Groupby  and  Having  clauses 
and  the  row  classification  is  actually  the  least  upper  bound  of  the  classifications  needed 
to  evaluate  these  clauses  as  well. 

7.  SUMMARY 

This  paper  has  described  Secure  SQL,  a  variant  of  standard  SQL89  which  has  been 
extended  to  support  the  field  level  classifications  provided  by  the  SWORD  Secure 
Relational  DBMS.  The  language  also  provides  detailed  Information  Labels  and 
allows  trustworthy  clients  to  work  freely  within  the  bounds  of  their  clearance  while 
still enforcing "No Flows Down".
Abstract

In this paper, we make two contributions related to transaction management in multilevel
secure distributed databases. First, we present a secure locking protocol (SLP) that
provides different degrees (0, 1, 2, and 3) of isolation. Next, we present a secure early
prepare commit protocol (SEP) that not only preserves atomicity of distributed
transactions, but can also be integrated with SLP without violating the isolation
requirements. Both SLP and SEP take advantage of the isolation degree of the transaction being
executed. For degrees 0, 1, and 2, SLP is free of starvation, and SEP requires only 4n
messages (where n is the number of sites participating in the commit protocol). For degree
3 isolation, SLP may suffer from starvation, although the probability of starvation is quite
small; and SEP may sometimes require more than 4n, but never more than 6n messages.
We suggest a way of reducing this additional cost in messages using synchronized clocks.

Keyword Codes: D.4.6; H.2.0; H.2.4

Keywords:  Security  and  Protection,  Information  Systems,  General;  Systems;  Multilevel 
Security;  Atomic  Commit,  Isolation,  Concurrency  Control;  Serializability 

1.  INTRODUCTION 

In a distributed database (DDB), there are several logical objects, which are physically
located at different sites or nodes. A distributed transaction, although initiated at one site,
may need to access objects stored at remote sites. To guarantee correct executions of
distributed transactions, each site in the DDB is equipped with a concurrency control
protocol and a commit protocol. There are several different concurrency control and commit
protocols, with two-phase locking (2PL) and basic two-phase commit (2PC) being the
most well-known concurrency control and commit protocols, respectively.

Although most conventional (single-level) commercial systems use locking based
mechanisms for concurrency control, they do not always use 2PL. They offer different degrees
*The work of E. Bertino was carried out while visiting George Mason University during summer 1993.
of isolation (0, 1, 2, and 3) to transactions [GLPT76, GR93]; each transaction has the
option of specifying the degree of isolation required.† The reason for providing different
options is that transactions that specify a lower degree of isolation require fewer and shorter
duration locks, leading to improvement in the amount of concurrency and response time.

Moreover, most commercial systems (e.g., IBM's LU6.2 and Digital's DECdtm) use
the early prepare commit protocol (EP) rather than 2PC as the commit protocol. This is
because these systems do not support the request/response paradigm, thus an implementation
of 2PC in these systems requires 6n messages (where n is the number of sites participating
in the commit protocol) compared to 4n messages required by EP.

In keeping with the existing practices in commercial systems, we first offer in this
paper a secure (distributed) locking protocol (SLP) that provides all four degrees of
isolation. A nice property of SLP is that it is not subject to starvation for degree 0,
1, or 2 isolation. Although SLP may suffer from starvation for degree 3 isolation, we
show that the probability of starvation is quite small. The significance of this result
is that locking, which is the universally accepted mechanism for concurrency control in
conventional databases, can be used in multilevel secure databases as well, provided we
are willing to tolerate a small probability of starvation.

Next,  we  give  a  secure  analog  of  EP,  called  SEP,  that  not  only  preserves  atomicity,  but 
requires  only  4n  messages  for  degree  0,  1,  or  2  isolation.  For  degree  3  isolation,  it  may 
sometimes  require  more  than  4n  messages,  but  it  never  requires  more  than  6n  messages. 
We suggest a way of reducing this additional cost in messages using synchronized clocks.
We  also  show  that  SLP  and  SEP  can  be  easily  integrated  to  provide  the  desired  degree 
of  isolation  as  well. 

This paper is organized as follows. We begin in section 2 with a review of related work.
In section 3, we present our distributed multilevel secure (MLS) DDB model. In section
4, we describe the lock-based protocol, SLP, which provides different degrees of isolation.
Since SLP is susceptible to starvation for degree 3 isolation, subsection 4.1 contains a
probability model showing that the probability of starvation is quite small. In section 5,
we describe SEP for different degrees of isolation. Since SEP sometimes requires extra
rounds of messages for degree 3 isolation, section 6 describes an optimization of SEP,
called 03SEP, which reduces this extra number of messages. Finally, section 7 presents
conclusions.

2.  RELATED  WORK 

Although most of the research in MLS transaction management has focused on
centralized databases, there has been some recent activity that deals with DDBs. In
[JM93, JMB93], Jajodia, McCollum, and Blaustein study the secure analogs of 2PL and
commit protocols. They modify 2PL to give a secure 2PL protocol (S2PL) that yields
degree 3 isolation. S2PL, like our SLP, is susceptible to starvation.

[JMB93] also shows how EP, 2PC, and some optimizations of 2PC (e.g., presumed
commit) can be modified to be secure. While their modifications to 2PC and presumed
commit yield a protocol that can be integrated with S2PL without any violation of global
† Degree 0 is also known as chaos, degree 1 as browse, degree 2 as cursor stability, and degree 3 as repeatable
reads or serializable.
consistency, their modifications to EP cannot guarantee the global consistency of the
distributed data when used in conjunction with S2PL. In this paper, we give a different
modification to EP that ensures global consistency of data.

3. THE DISTRIBUTED SYSTEM MODEL

A distributed system consists of a set of nodes $N$, where each node $N_i \in N$ is an
MLS DBMS. The various nodes in the distributed system are connected via communication
links. We assume that these communication links are tamper proof. Each node is
supported by a Trusted Computing Base (TCB), which is responsible for mediating all
database accesses and cannot be bypassed. Each node also has its own local transaction
manager (LTM) and distributed transaction manager (DTM). The DTM acts as an
interface between the LTM and distributed transactions (those that originate at the site or
are remote).

3.1. The MLS DBMS model

We model the MLS DBMS as a quadruple $\langle D, T, S, L \rangle$, where $D$ is the set of data
items (objects), $T$ is the set of transactions (subjects), $S$ is the partially ordered set of
access classes (or security levels) with an ordering relation $\leq$, and $L$ is a mapping from
$D \cup T$ to $S$. For every $x \in D$, $L(x) \in S$, and for every $T_i \in T$, $L(T_i) \in S$. In other words,
every data item as well as every transaction has a security class associated with it.

We extend the mapping $L$ such that it maps each MLS DBMS $N_i$ to an ordered pair
of security classes $L_{min}(N_i)$ and $L_{max}(N_i)$. Clearly, it should always be the case that
$L_{min}(N_i) \leq L_{max}(N_i)$ and $L_{min}(N_i), L_{max}(N_i) \in S$. In other words, every MLS DBMS in
the distributed system has a range of security levels associated with it. For every data
item $x$ stored in an MLS DBMS $N_i$, $L_{min}(N_i) \leq L(x) \leq L_{max}(N_i)$. Similarly, for every
transaction $T_j$ executed at $N_i$, $L_{min}(N_i) \leq L(T_j) \leq L_{max}(N_i)$. A node $N_i$ is allowed to
communicate with another node $N_j$ only if $L_{max}(N_i) = L_{max}(N_j)$. The reader may refer
to [JM93] for additional details on the distributed MLS DBMS model.

Our security policy is based on the Bell-LaPadula model [BL76]. According to this
model, the following two conditions are necessary for a system to be secure:

•  A transaction $T_i$ is allowed to read a data element $x$ only if $L(x) \leq L(T_i)$.

•  A transaction $T_i$ is allowed to write a data element $x$ only if $L(x) = L(T_i)$.

The second property, which allows transactions to write only at their own level, is a restricted
version of the *-property. The original *-property proposed in the Bell-LaPadula model
allows transactions to write into levels above their security level. However, in the database
context, it seems prudent to disallow transactions that write to higher levels [JK90].

In addition to these two requirements, a secure system must guard against illegal
information flows through signaling and covert channels.

3.2. The distributed transaction model

When a transaction $T_i$ is submitted to the DTM of a local MLS DBMS, if $T_i$ is a local
transaction, DTM simply passes it to the LTM. When $T_i$ is distributed, DTM assumes
the role of the coordinator and, thus, is responsible for resolving references to the data
items accessed by that transaction so as to determine where these data items are located.
DTM then generates the subordinate processes (from now on we refer to these as simply
subordinates). A distributed transaction $T_i$ can originate at an MLS DBMS $N_k$ where
$N_k \in N$ if $L_{min}(N_k) \leq L(T_i) \leq L_{max}(N_k)$. A subordinate $T_{i,k}$ can execute at $N_k$ if
$L_{min}(N_k) \leq L(T_{i,k}) \leq L_{max}(N_k)$.‡

Each subordinate $T_{i,k}$ at node $N_k$ inherits the same security level as that of $T_i$:
$L(T_{i,k}) = L(T_i)$. Even if a subordinate is a read-only transaction that reads data items below the
security level of the originating transaction, it is executed as a read-down transaction
at the corresponding subordinate node, its clearance level being the same as that of the
originating transaction. Every subordinate $T_{i,k}$ is subjected to the security restrictions
described in the earlier subsection. These, when combined with the range constraints,
result in the following conditions on subordinates. A subordinate $T_{i,k}$ can execute at node
$N_k$ only if

•  whenever $T_{i,k}$ wishes to read an item $x$, $L_{min}(N_k) \leq L(x) \leq L(T_i)$, and
•  whenever $T_{i,k}$ wishes to write an item $x$, $L(x) = L(T_i) \leq L_{max}(N_k)$.
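
The range and access conditions above can be written as three small predicates. This is a
minimal Python sketch with hypothetical names, and it assumes that security levels are
integers in a total order, whereas the model only requires a partial order on $S$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """An MLS DBMS node with its range of security levels."""
    l_min: int
    l_max: int


def may_execute(node, level_t):
    """L_min(N_k) <= L(T_i,k) <= L_max(N_k)."""
    return node.l_min <= level_t <= node.l_max


def may_read(node, level_t, level_x):
    """Read down, but never below the node's lowest level."""
    return node.l_min <= level_x <= level_t


def may_write(node, level_t, level_x):
    """Write only at the transaction's own level, within the node's range."""
    return level_x == level_t <= node.l_max


node = Node(l_min=1, l_max=3)
print(may_execute(node, 2), may_read(node, 2, 1), may_write(node, 2, 2))
```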

3.3.  The  transaction  model 

We model a transaction $T_i$ (either a local transaction or a subordinate) as a sequence
of read and write operations on data items. We use $r_i[x]$ and $w_i[x]$ to denote the read
and write operations issued by a transaction $T_i$ on a data item $x$.

Sometimes  transactions  acquire  a  lock  before  they  read  or  write  an  item.  They  acquire 
a  share  lock  (S-lock)  for  reading  and  an  exclusive  lock  (X-lock)  for  writing  an  item.  All 
locks  acquired  by  a  transaction  must  eventually  be  released. 

Definition 1 Two locks $o_i[x]$ and $o_j[x]$ are compatible if $i = j$ or neither of them is an
X-lock. □

An S-lock is compatible with other S-locks on an item and, therefore, multiple
transactions can hold an S-lock on an item at the same time. However, only one transaction
can hold an X-lock on an item at any given time.
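
Definition 1 translates directly into a one-line test. In this Python sketch a lock on a
given item is modelled, as an assumption, by a `(transaction, mode)` pair with mode "S" or
"X"; the function name is hypothetical.

```python
def compatible(lock_a, lock_b):
    """Two locks on the same item are compatible if they belong to the
    same transaction or neither of them is an X-lock."""
    (txn_a, mode_a), (txn_b, mode_b) = lock_a, lock_b
    return txn_a == txn_b or (mode_a != "X" and mode_b != "X")


print(compatible(("T1", "S"), ("T2", "S")))   # two readers
print(compatible(("T1", "S"), ("T2", "X")))   # reader and writer
```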

Definition  2  A  transaction  is  well-formed  with  respect  to  writes  if  it  acquires  an  X-lock 
before  writing  a  data  item.  A  transaction  is  well-formed  if  it  acquires  an  S-lock  (X-lock) 
before  reading  (writing)  a  data  item.  □ 

Definition 3 A transaction is two-phase with respect to writes if it does not acquire any
X-lock after it releases an X-lock on a data item. A transaction is two-phase if it does
not acquire any more locks once it releases a lock on a data item. □

3.4.  Degrees  of  Isolation 

We present below the definitions for degree 0, 1, 2, and 3 isolation proposed by Gray
et al. in [GLPT76].

‡ We use $T_{i,k}$ to denote the subordinate of $T_i$ at node $N_k$.
Definition 4

Degree 0: A transaction observes degree 0 isolation if it is well-formed with respect
to writes.

Degree 1: A transaction observes degree 1 isolation if it is well-formed with respect
to writes and also two-phase with respect to writes.

Degree 2: A transaction observes degree 2 isolation if it is well-formed and two-phase
with respect to writes.

Degree 3: A transaction observes degree 3 isolation if it is well-formed and two-phase.

□
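
Definitions 2 to 4 can be checked mechanically against the sequence of actions of a single
transaction. The Python sketch below is an illustration under assumptions: the event
encoding (`"lock_S"`, `"lock_X"`, `"unlock"`, `"read"`, `"write"`, each paired with an
item) and the function name are hypothetical, and a read is accepted under either an
S-lock or an X-lock.

```python
def isolation_degree(events):
    """Return the highest degree (0 to 3) the event sequence observes, or None."""
    held = {}                               # item -> mode currently held
    wf_writes = wf_reads = True             # well-formedness flags
    tp_all = tp_writes = True               # two-phase flags
    released_any = released_x = False
    for action, item in events:
        if action in ("lock_S", "lock_X"):
            tp_all &= not released_any      # any lock after any release
            if action == "lock_X":
                tp_writes &= not released_x # X-lock after an X release
            held[item] = action[-1]
        elif action == "unlock":
            released_any = True
            released_x |= held.pop(item, None) == "X"
        elif action == "read":
            wf_reads &= item in held
        elif action == "write":
            wf_writes &= held.get(item) == "X"
    if not wf_writes:
        return None
    if wf_reads and tp_all:
        return 3
    if wf_reads and tp_writes:
        return 2
    return 1 if tp_writes else 0


short_read_locks = [("lock_S", "a"), ("read", "a"), ("unlock", "a"),
                    ("lock_X", "b"), ("write", "b"), ("unlock", "b")]
print(isolation_degree(short_read_locks))
```

The sample releases its S-lock before acquiring the X-lock, so it is two-phase only with
respect to writes.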

4. SECURE LOCKING PROTOCOL FOR DIFFERENT DEGREES OF ISOLATION

For the distributed executions of transactions to be correct, it is necessary to ensure
atomicity, consistency, isolation, and durability [GR93]. Isolation means that the system
gives every transaction the illusion it is being executed alone all by itself, although other
transactions are run concurrently in the system.

While  total  isolation  is  desirable,  most  commercial  systems  offer  different  degrees  of 
isolation,  viz.,  degree  0,  1,  2  and  3.  In  SQL2,  a  transaction  can  specify  its  desired  degree 
of  isolation,  declaring  the  degree  of  sharing  it  can  tolerate.  Lower  degrees  of  isolation 
improve  the  performance  of  the  system,  though  achieved  at  the  expense  of  consistency. 

Intuitively, with degree 0 isolation, a transaction is not allowed to update a data item
while another transaction is updating it. With degree 1 isolation, a transaction cannot
overwrite a data item until the completion of the transaction which wrote it earlier. With
degree 2 isolation, a transaction can read data items only from committed transactions.
Degree 3 provides complete isolation to transactions. By choosing this degree of isolation,
a transaction must wait to either read or write a data item for the completion of all other
transactions that read or wrote it.

The stringent requirements imposed by multilevel security dictate modification of the
locking protocols given in the previous chapter. An ideal locking protocol in a multilevel
secure database management system must possess the following key properties:

•  Provide the desired degree of isolation for transactions

•  Preserve security (i.e., it must obey Bell-LaPadula restrictions and be free of
signaling channels)

•  Be implementable with untrusted code

•  Be free of starvation. Starvation may occur because high level transactions may be
subjected to indefinite delays or suspended repeatedly by low level transactions to
prevent signaling channels.

Since transactions can specify their desired degree of isolation, in this section, we
propose a secure locking protocol that gives degree 0, 1, 2, and 3 isolation. This protocol is a
modified version of the protocol given in [GLPT76, GR93] so that it meets the security
requirements.

Algorithm  1  [Secure  Locking  Protocol  (SLP)] 

•  For degree 0 isolation,

-  Whenever a transaction or a subordinate wishes to write a data item $x$, it must
first acquire an X-lock on $x$ before writing $x$.

-  A transaction or a subordinate $T_i$ releases the X-lock on a data item $x$ when
the write operation is completed.

•  For degree 1 isolation,

-  Whenever a transaction or a subordinate wishes to write a data item $x$, it must
first acquire an X-lock on $x$ before writing $x$.

-  A transaction or a subordinate $T_i$ releases all X-locks only when it commits or
aborts.

-  A transaction or a subordinate cannot acquire any more X-locks once it releases
an X-lock.

•  For degree 2 isolation,

-  All transactions and subordinates are well-formed. That is, whenever a
transaction or a subordinate wishes to read (write) a data item $x$, it must first
acquire an S-lock (X-lock) before reading (writing) $x$.

-  A transaction or a subordinate $T_i$ releases all X-locks only when it commits or
aborts. However, $T_i$ must release an S-lock as soon as the corresponding read
operation is completed.

-  A transaction or a subordinate cannot acquire any more X-locks once it releases
an X-lock.

•  For degree 3 isolation,

-  All transactions and subordinates are well-formed. That is, whenever a
transaction or a subordinate wishes to read (write) a data item $x$, it must first
acquire an S-lock (X-lock) before reading (writing) $x$.

-  A transaction or a subordinate $T_i$ releases all locks only when it commits or
aborts. However, $T_i$ must release an S-lock on a data item $x$ whenever another
transaction $T_j$ requests an X-lock on $x$ such that $L(T_j) < L(T_i)$. In such an
event, $T_i$ is aborted.

-  A transaction or a subordinate cannot acquire any more locks once it releases
a lock.

□ 
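
The distinctive step of the degree 3 rules is the forced release of a higher level S-lock
when a lower level transaction asks for an X-lock. The Python sketch below illustrates
only that step. It is not the authors' implementation: the `LockTable` class and its
methods are hypothetical, security levels are assumed to be integers in a total order, and
a refused request simply returns False instead of queueing the caller.

```python
class LockTable:
    """Toy lock table for degree 3 isolation at a single node."""

    def __init__(self):
        self.locks = {}        # item -> list of (txn, level, mode)
        self.aborted = set()

    def release_all(self, txn):
        for holders in self.locks.values():
            holders[:] = [h for h in holders if h[0] != txn]

    def abort(self, txn):
        self.aborted.add(txn)
        self.release_all(txn)

    def request(self, txn, level, mode, item):
        """Try to grant a lock; return True if granted, False if txn must wait."""
        holders = self.locks.setdefault(item, [])
        conflicting = [h for h in holders
                       if h[0] != txn and "X" in (mode, h[2])]
        if mode == "X":
            # Higher level readers must give way to a lower level writer.
            victims = [h for h in conflicting if h[2] == "S" and h[1] > level]
            for victim in victims:
                self.abort(victim[0])
            conflicting = [h for h in conflicting if h not in victims]
        if conflicting:
            return False
        holders.append((txn, level, mode))
        return True


table = LockTable()
table.request("T_high", 2, "S", "x")            # high transaction reads down
granted = table.request("T_low", 1, "X", "x")   # low writer preempts it
print(granted, sorted(table.aborted))
```

The low level writer is never delayed by the high level reader, which is what closes the
signaling channel, at the price of aborting the reader.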
It is important to note that SLP is single-level, and therefore, can be implemented with
untrusted code. However, since the SLPs at all security levels use a common lock-table,
the lock-manager must be trusted.

For degree 2 isolation, whenever a low transaction requests an X-lock on a data item
while a high transaction's read operation on the same data item is being executed, the
high transaction is not required to release its S-lock to accommodate the write request by
the low transaction. This is because we assume that the read operation is instantaneous
and thus does not cause any delay in processing the low transaction's write request, and
therefore, does not introduce a signaling channel. In other words, we make the assumption
that the three actions (acquiring an S-lock, reading the data item and then releasing the
S-lock) are executed instantaneously. This is not an unreasonable assumption since SLP
for degree 2 isolation does not require the S-locks to be two-phase; the S-locks are short
duration locks and are released as soon as the read operation is performed.

On the other hand, for degree 3 isolation, whenever a low transaction requests an
X-lock on a data item on which a high transaction already has an S-lock,§ the high
transaction releases its S-lock and thereby allows the low transaction to proceed with its
write operation. Otherwise, the lower level transaction would need to wait for the release
of this S-lock by the high transaction. This situation can be exploited by two colluding
transactions at levels high and low to establish a signaling channel. To prevent such illicit
flow of information, a secure system must prioritize lower level transactions over their
higher level counterparts while allocating the locks. In this process, some transactions
may get aborted repeatedly, resulting in starvation.
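
The degree 3 rule above can be illustrated with a small lock table. The following is only a sketch under stated assumptions, not the SLP algorithm itself: security levels are modelled as integers (0 for low, 1 for high), and the class `LockTable` and its methods are hypothetical names introduced for illustration. It shows a low X-lock request breaking the S-lock of a higher-level reader instead of waiting for it.

```python
from dataclasses import dataclass, field


@dataclass
class LockTable:
    """Toy lock table shared by all levels (hypothetical, for illustration only)."""
    s_locks: dict = field(default_factory=dict)  # item -> {txn: level}
    x_locks: dict = field(default_factory=dict)  # item -> (txn, level)

    def acquire_s(self, txn, level, item):
        """Grant an S-lock unless another transaction holds an X-lock on the item."""
        holder = self.x_locks.get(item)
        if holder is not None and holder[0] != txn:
            return False
        self.s_locks.setdefault(item, {})[txn] = level
        return True

    def acquire_x(self, txn, level, item):
        """Return (granted, preempted).

        Readers at a strictly higher level lose their S-locks so that the
        lower-level writer is never delayed by them; a reader at the same
        (or a lower) level makes the writer wait as usual.
        """
        holder = self.x_locks.get(item)
        if holder is not None and holder[0] != txn:
            return False, []
        readers = self.s_locks.get(item, {})
        others = {t: lvl for t, lvl in readers.items() if t != txn}
        if any(lvl <= level for lvl in others.values()):
            return False, []
        preempted = sorted(others)
        for t in preempted:
            del readers[t]  # the high reader's S-lock is broken
        self.x_locks[item] = (txn, level)
        return True, preempted
```

A transaction whose S-lock appears in `preempted` would then have to abort or re-read the item, which is where the starvation discussed in this section comes from.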

To reduce the amount of starvation, Jajodia and McDermott [MJ92] propose a variation
of this approach. According to this variation, whenever a high transaction prematurely
releases its S-lock on a low data item due to security reasons, it does not abort or roll
back entirely. The high transaction continues to hold its X-locks on high data items,
marks the low data item in its private workspace as unread, and retries reading this data
item by entering into a queue. This queue maintains the list of all high transactions
waiting to reread that particular data item, and enables the first transaction in the queue
to be serviced first. Though transactions are not two-phase, this approach guarantees
serializability. We refer the reader to [MJ92] for additional details and for the proof of
correctness.

Theorem 1 If a transaction observes the SLP, then any legal history¶ H will give that
transaction degree 1, 2, or 3 isolation, as long as other transactions in H are at least
degree 1.

Proof: Proof is similar to the one given in [GLPT76, pages 384-386]. □

4.1. Probability of starvation

A very simple model, adapted from [GR93], is provided to estimate the probability
of starvation. Suppose that the database has R data items. Moreover, suppose that
there is a high transaction that wishes to read l low data items and that there are n
low transactions, each modifying r low data items. These transactions are all running
concurrently.

§ Recall that high transactions are allowed to read, but not write, data at lower levels. Therefore, a high
transaction will never be able to acquire an X-lock on a lower level data item.

¶ A history is said to be legal when no two incompatible locks on an item are simultaneously held by
transactions in that history.

The probability that the high transaction must release a lock on a low data item,
because a low transaction is modifying it, is approximately P = (nr)/R. Note that, as
pointed out in [GR93], nr << R, i.e., most data items are unlocked most of the time.
The probability that the high transaction is aborted because one of its locks on the low
data items has been released early is given by:

$$TA = 1 - (1 - P)^{l} \approx lP = \frac{lnr}{R}$$

The  high-order  terms  can  be  dropped  because  nr  «  R  and,  therefore,  P  «  1. 

The probability that the high transaction is aborted again, after being restarted the
first time, is given by TA^2. Therefore, the probability that a transaction is aborted n
times, due to early release of locks, is given by

$$TA^{n} \qquad (\star)$$

Since in most cases the number of data items locked by the transactions is a small
subset of the total data items, expression (★) rapidly decreases with the increase in n. As
an example, consider the case of a database containing 1,000,000 data items, with 100
low transactions, each modifying 100 data items. Moreover, consider a high transaction
that reads 25 low data items (i.e., 25% of the data modified by a low transaction). Here
P = (100 × 100)/1,000,000 = 0.01, so TA ≈ 25 × 0.01 = 1/4. The probability that the
transaction is aborted once is therefore 1/4, and the probability that it is aborted twice
is 1/16, which is already quite a small probability.
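
The estimate above is easy to reproduce numerically. The following is a minimal sketch of the model as described in this section; the function and parameter names are ours and are introduced only for illustration (`reads` is the number of low items read by the high transaction, `writers` the number of low transactions, `writes_each` the items each one modifies, and `db_size` the number of items in the database).

```python
def release_probability(writers, writes_each, db_size):
    """P: chance that a given low item is being modified by some low transaction."""
    return (writers * writes_each) / db_size


def abort_probability(reads, writers, writes_each, db_size):
    """Exact form 1 - (1 - P)^reads of the abort probability TA."""
    p = release_probability(writers, writes_each, db_size)
    return 1.0 - (1.0 - p) ** reads


def abort_probability_approx(reads, writers, writes_each, db_size):
    """First-order approximation TA ~ reads * P, valid when P is much smaller than 1."""
    return reads * writers * writes_each / db_size


def repeated_abort_probability(reads, writers, writes_each, db_size, times):
    """Probability of being aborted `times` times in a row, using the approximation."""
    return abort_probability_approx(reads, writers, writes_each, db_size) ** times


if __name__ == "__main__":
    args = (25, 100, 100, 1_000_000)
    print(abort_probability_approx(*args))        # first-order estimate used in the text
    print(abort_probability(*args))               # exact value, slightly smaller
    print(repeated_abort_probability(*args, 2))   # aborted twice
```

With the figures of the example, the approximation gives 1/4 and 1/16, while the exact expression gives a somewhat smaller value for a single abort, since the dropped higher-order terms are negative.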

5. SECURE EARLY PREPARE PROTOCOL FOR DIFFERENT DEGREES
OF ISOLATION

In this section, we present SEP that takes into consideration different degrees of
isolation. We assume that SLP is being used as the concurrency control algorithm by all
LTMs.


Algorithm  2  [Secure  Early  Prepare  (SEP)] 

When a user who is logged on at security level s initiates a distributed transaction Ti
at a node Nj, the user must specify Ti's degree of isolation. The DTM at Nj acts as the
coordinator for Ti, and initiates the first phase of SEP.
The prepare phase:

• The coordinator generates subordinates Ti,1, Ti,2, ..., Ti,n‖ and sends** them to the
nodes N1, N2, ..., Nn, respectively. The coordinator also sends the security level,
which is s, and the isolation degree, which is the same as the isolation degree specified
by the user, with each subordinate.

• The DTM at each node Nk, k = 1, ..., n, hands Ti,k, its security level, and its
degree of isolation to LTM. Ti,k is executed by the LTM, taking into consideration
the isolation level of Ti,k.

- For degree 0 isolation, LTM acquires an X-lock for each item before it is written
by Ti,k. These locks are short locks, meaning that they can be released as soon as
the write has taken place. DTM sends a yes vote to the coordinator if Ti,k
successfully completes its execution; it sends a no vote otherwise.

- For degree 1 isolation, LTM acquires an X-lock for each item before it is written
by Ti,k. These locks are long locks, meaning that they must be held until Ti,k
commits at the end of the decision phase. DTM sends a yes or no vote to the
coordinator depending on whether or not Ti,k completes successfully.

- For degree 2 isolation, LTM acquires an S-lock (X-lock) on an item before it
is read (written) by Ti,k. S-locks are short locks, while X-locks are long. DTM
sends a yes or no vote to the coordinator depending on whether or not Ti,k
completes successfully.

- For degree 3 isolation, LTM acquires an S-lock (X-lock) on an item before
it is read (written) by Ti,k. S-locks as well as X-locks are long locks. If Ti,k
completes successfully, DTM augments its yes vote to the coordinator with
a read-low indicator. A one-bit read-low indicator is added whenever Ti,k has
read an item from a lower level. DTM sends a no vote if LTM cannot commit
Ti,k.

A subordinate that sends a yes vote to commit is said to be in a prepared state.

The decision phase:

• Suppose the coordinator receives yes votes from all its subordinates.

- For degree 0, 1, or 2 isolation, the coordinator commits Ti and then sends
commit messages to all its subordinates.

- For degree 3 isolation, there are two cases. If no subordinate has read data from
lower levels, the coordinator commits Ti and then sends commit messages to
all its subordinates. On the other hand, an extra round of messages is required
between the coordinator and all those subordinates Nj that sent the read-low
indicator with their yes vote.

‖ We assume Ti is decomposed into n subordinates, and Ti,j, the subordinate at the originating node Nj,
is one among them.

** We use bold letters to indicate messages.
* The coordinator sends to each Nj a confirm message to confirm the commit.

* If Nj has not released its S-locks on any lower level data item during the
time it has been in the prepared state, it responds with a confirmed
message; otherwise, it sends a not-confirmed message.

* If the coordinator receives a confirmed message from all Nj's to which
the coordinator has sent the additional round of messages, then it sends
commit messages to all its subordinates; otherwise it sends abort.

• If the coordinator receives at least one no vote or if it times out waiting for a vote,
it aborts the transaction, and sends abort messages to all subordinates.

• Each subordinate is committed or aborted according to the message received, and
then an acknowledgment of this fact is sent back to the coordinator.

• After receiving the acknowledgment from all the subordinates, the coordinator
terminates Ti.
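
The coordinator's side of the decision phase can be summarized in a few lines. The sketch below is a simplified reading of Algorithm 2 under stated assumptions, not a full implementation: message passing, logging, and acknowledgments are omitted, a timed-out vote is represented by `None`, and the names `Vote`, `sep_decide`, and the `confirm` callback (which stands for the extra confirm round trip to one node) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Vote:
    yes: bool
    read_low: bool = False  # one-bit indicator, only meaningful for degree 3


def sep_decide(degree: int,
               votes: Dict[str, Optional[Vote]],
               confirm: Callable[[str], bool]) -> str:
    """Return "commit" or "abort" for one distributed transaction.

    votes maps each subordinate node to its vote (None models a timeout).
    confirm(node) models the extra round: True for a confirmed reply,
    False for a not-confirmed reply.
    """
    # Any no vote or timeout aborts the transaction, whatever the degree.
    if any(v is None or not v.yes for v in votes.values()):
        return "abort"
    # Degrees 0, 1 and 2 never need the extra round.
    if degree < 3:
        return "commit"
    # Degree 3: ask every node that reported reading low data.
    read_low_nodes = [node for node, v in votes.items() if v.read_low]
    replies = [confirm(node) for node in read_low_nodes]
    return "commit" if all(replies) else "abort"
```

For instance, with votes from three nodes of which two carry the read-low indicator, a single not-confirmed reply is enough to abort, whereas the same votes at degree 2 commit without the callback ever being invoked.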

5.1.  Discussion 

If  we  compare  the  number  of  messages  required  by  SEP  to  that  required  by  EP,  EP 
always  requires  about  4n  messages  (where  n  is  the  number  of  subordinates),  while  SEP 
requires  4n  messages  for  degree  0,  1,  and  2  isolation,  but  sometimes  requires  more  than 
4n  messages  for  degree  3  isolation.  An  extra  round  of  messages  is  required  between  the 
coordinator  and  those  nodes  where  the  subordinates  have  read  data  from  the  lower  levels. 
This is because degree 3 isolation in the distributed setting requires not only that each
subordinate be two-phase, but that the distributed transaction as a whole be two-phase
as well (see [JMB93, Lom93, MLO86]). A simple way to ensure this is to require
that all S-locks be held until the commit of the transaction. Unfortunately, we cannot
impose such a restriction in a multilevel secure environment; a subordinate must release
its S-lock on a data item whenever a lower level transaction requests an X-lock on the
same data item.
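
The message counts compared above can be made concrete. The sketch below assumes, as our reading of the protocol rather than a statement from the text, that the extra round costs exactly two messages (one confirm and one reply) for each node that sent the read-low indicator; this is consistent with the upper bound of 6n messages given in the conclusion. The function name is hypothetical.

```python
def sep_message_count(n: int, degree: int, read_low_nodes: int = 0) -> int:
    """Approximate number of messages SEP exchanges for n subordinates.

    The base cost of 4n (subordinate, vote, decision, acknowledgment) is the
    same as for EP. For degree 3, each node that reported reading low data
    adds a confirm message and a reply (assumption: two messages per node).
    """
    if not 0 <= read_low_nodes <= n:
        raise ValueError("read_low_nodes must be between 0 and n")
    extra = 2 * read_low_nodes if degree == 3 else 0
    return 4 * n + extra
```

Under this assumption, a degree 3 transaction with five subordinates costs 20 messages if none of them read low data and 30 messages if all of them did.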

Note that this release of locks does not cause any violation of the degree 3 isolation
requirements if it occurs while the subordinate is still being executed (i.e., before the
subordinate enters the prepared state). In such a case, the subordinate can be either
aborted or reexecuted. However, if the release of locks occurs while the subordinate is in
the prepared state, the subordinate can be neither aborted nor started over again. We
illustrate this further by way of an example.

Let T1 and T2 be two distributed transactions as follows:

T1 = r1[x] r1[y] w1[z], L(T1) = high
T2 = w2[x] w2[y], L(T2) = low

Suppose T1 is initiated at node Nc, and T2 is initiated at node Nb. Furthermore, assume
that data item x is stored at node Na, y is stored at Nb, and z is stored at Nc. The
coordinators, Nc and Nb, generate the subordinates, and send them to the corresponding
remote nodes. Accordingly, Nc divides T1 into three subordinates, T1,a, T1,b, and T1,c, and
then sends T1,a and T1,b to Na and Nb, respectively. Similarly, Nb, upon dividing T2 into
two subordinates T2,a and T2,b, sends T2,a to Na.

The execution of these subordinates at each of the nodes may result in a distributed
history D, as shown in figure 1. At Na, the following sequence of events takes place: After
successful execution, the subordinate T1,a votes yes and enters the prepared state. At
this point, the low level subordinate T2,a arrives. T1,a releases its S-lock on x, enabling
T2,a to acquire an X-lock on x. At Nb, the subordinate T1,b is successfully executed after
the commit of T2,b. At Nc, T1 is committed after the coordinator receives the yes vote
from all the subordinates.


[Figure omitted: timelines for nodes Na, Nb (coordinator of T2), and Nc (coordinator of T1),
showing T1 prepared and then releasing its locks at Na, w2[y] at Nb, and w1[z] at Nc.]

Figure 1 The distributed history D


Clearly, the distributed history D is not serializable since T1 is serialized before T2 at
Na, while the serialization order is reversed at node Nb. A moment of reflection shows
that this inconsistency arises because the distributed transaction T1 is not two-phase,
although it is well-formed.
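
The non-serializability of D can also be checked mechanically by building the serialization graph from the local histories and looking for a cycle. The following sketch encodes the order of conflicting operations at each node as described in the text; the encoding of operations as tuples and the function names are our own and serve only as an illustration.

```python
def serialization_edges(local_histories):
    """Collect edges Ta -> Tb for every pair of conflicting operations
    (same item, different transactions, at least one write) in which
    Ta's operation precedes Tb's at some node."""
    edges = set()
    for ops in local_histories.values():
        for i, (t1, kind1, item1) in enumerate(ops):
            for t2, kind2, item2 in ops[i + 1:]:
                if t1 != t2 and item1 == item2 and "w" in (kind1, kind2):
                    edges.add((t1, t2))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # back edge: the graph has a cycle
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph[node])
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in graph)


# Local histories of D: T1 reads x before T2 writes it at Na,
# but reads y only after T2 has written it at Nb.
history_d = {
    "Na": [("T1", "r", "x"), ("T2", "w", "x")],
    "Nb": [("T2", "w", "y"), ("T1", "r", "y")],
    "Nc": [("T1", "w", "z")],
}
```

Here `serialization_edges(history_d)` contains both T1 → T2 (from Na) and T2 → T1 (from Nb), so `has_cycle` reports a cycle and D is not serializable.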

It is easy to see that SEP manages to avoid the above problem. During the prepare
phase of the protocol, T1,a and T1,b send read-low indicators with their yes votes. As a
result, during the decision phase, coordinator Nc sends a confirm message to both Na and
Nb. Since T1,a has released its S-lock on x, Na responds with a not-confirmed message,
and Nb responds with a confirmed message since T1,b has not released its shared lock
on the low data item y during the prepared state. Because Nc receives a not-confirmed
message, it decides to abort T1 and informs all the subordinates of the outcome. Thus,
our protocol is able to avoid the problem.

5.2.  Proof  of  correctness 

Theorem 2 Suppose that every DTM uses SEP for atomic commitment of distributed
transactions and that every LTM uses SLP for concurrency control. Then any legal history
H consisting of local and distributed transactions will give transactions degree 1, 2, or 3
isolation, as long as other transactions are at least degree 1.

Proof: Suppose Ti is a transaction in H. We prove this theorem in both cases, where Ti
is a local transaction and where Ti is a distributed transaction.

First, suppose that Ti is a local transaction in H. Then Ti is given degree 1, 2, or 3
isolation, as long as other transactions in H are at least degree 1. This is because, in such
a case, this theorem reduces to theorem 1.
Next, suppose Ti is a distributed transaction in H initiated at node Nj. Assume that
Ti requires to be executed at n nodes. The DTM at Nj generates Ti,1, Ti,2, ..., Ti,n and
sends them to their respective nodes N1, N2, ..., Nn.

We  prove  this  theorem  in  the  following  three  cases. 

Case 1: First we show that if Ti specifies its degree of isolation as degree 1, then it is
given degree 1 isolation as long as other transactions in H are at least degree 1.

Every LTM at nodes N1, N2, ..., Nn observes the SLP for degree 1 isolation to schedule
the operations of Ti,1, Ti,2, ..., Ti,n. In other words, every subordinate of Ti is well-formed
with respect to writes and two-phase with respect to writes. According to definition 4, Ti
is given degree 1 isolation.

Case 2: In this case, we show that if Ti specifies its degree of isolation as degree 2, then
it is given degree 2 isolation as long as other transactions in H are at least degree 1.

Every LTM at nodes N1, N2, ..., Nn observes the SLP for degree 2 isolation to schedule
the operations of Ti,1, Ti,2, ..., Ti,n. In other words, every subordinate of Ti is well-formed
and two-phase with respect to writes. According to definition 4, Ti is given degree 2
isolation.

Case 3: In this case, we show that if Ti specifies its degree of isolation as degree 3, then
it is given degree 3 isolation as long as other transactions in H are at least degree 1.

In this case, we need only to argue that every distributed transaction Ti observing the SEP
protocol for degree 3 isolation is always two-phase and, therefore, Ti sees degree 3 isolation
in H.

According to SEP for degree 3 isolation, a transaction or subordinate must release an
S-lock on a data item if some other lower level transaction or subordinate requests an
X-lock on the same data item.

Suppose none of the subordinates of Ti release their S-locks. In such a case, we can
trivially see that the distributed transaction Ti is well-formed and two-phase and therefore
Ti is given degree 3 isolation.

Now suppose that at least one of the subordinates of Ti reads a low data item x and
releases its S-lock on x. If one of the subordinates of Ti, say Ti,k, releases the S-lock on
x before entering the prepared state, then by voting no it chooses to abort Ti. On the
other hand, suppose Ti,k releases its S-lock on x during its prepared state. According to
SEP, since the DTM at Nj receives a read-low indicator, it sends an additional round
of messages to Nk. Since Ti,k responds with a not-confirmed message, Ti is aborted.
Therefore, whenever any of the subordinates of Ti releases its S-lock during the prepared
state of the SEP protocol, Ti gets aborted. In other words, SEP allows a distributed
transaction Ti to commit only if all its subordinates continue to hold all their respective
locks until Ti's commit. Since Ti is well-formed and two-phase, according to definition 4,
Ti is given degree 3 isolation.

6. AN OPTIMIZED DEGREE 3 SECURE EARLY PREPARE (O3SEP)

For degree 3 transactions, SEP as described above has two drawbacks. In addition to
sometimes requiring more than 4n messages, it is overly pessimistic. Any high subordinate
that reads low data is aborted if any of its S-locks on the low data are broken while it
waits for the confirm message from the coordinator. Thus, SEP aborts a transaction if
there is a possibility of a violation of the two-phase requirement. It is entirely possible
that the transaction is two-phase, even though some of the S-locks are broken.

We can improve in both these areas if we assume that clocks in the distributed system
are synchronized. This is not an unreasonable assumption [Lis91]. Using time services
such as network time protocol [Mil90] or Digital time service [Dig89], it is possible to
have distributed clocks that are synchronized within a millisecond, “even after extended
periods when synchronization to primary reference sources has been lost” [Mil90].

In this section, we propose an optimization to SEP for degree 3 isolation, called O3SEP,
that uses the synchronized clocks. O3SEP uses the clocks to isolate the exact situation in
which the two-phase requirements are violated. This optimization reduces not only the
number of messages, but the number of transactions being aborted as well.

In the O3SEP protocol, we maintain for every transaction the time at which each lock
has been granted. We also maintain the early lock release time for each transaction if the
transaction prematurely releases its S-lock on a lower level data item in order to
accommodate a lower level transaction. We present next the necessary notation that will be
used in this protocol.

Notation: Given a subtransaction Ti,j, $t^{lock}_{i,j}$ denotes the time at which the last lock is
obtained by Ti,j. Given a subtransaction Ti,j that has read a low data item and voted yes,
$t^{yes}_{i,j}$ denotes the time when the yes reply is sent. Moreover, given a subtransaction Ti,j that
has read a low data item and voted yes but was later forced to prematurely release some locks,
$t^{rel}_{i,j}$ denotes the time at which the first early release of a lock has occurred.

Given a transaction Ti, let

• $max^{lock}_{i}$ = maximum{$t^{lock}_{i,j}$ | Ti,j is a subtransaction of Ti}.

• $min^{rel}_{i}$ = minimum{$t^{rel}_{i,j}$ | Ti,j is a subtransaction of Ti such that Ti,j has read
low data and has released some lock early}.

In other words, $max^{lock}_{i}$ denotes the time of the latest lock acquired by transaction Ti,
whereas $min^{rel}_{i}$ denotes the time of the earliest lock release performed by a subordinate
that has read low data. □


Algorithm 3 [Optimized Degree 3 Secure Early Prepare (O3SEP)]

When a user who is logged on at security level s initiates a distributed transaction Ti
at a node Nj, the user must specify Ti's degree of isolation. The DTM at Nj acts as the
coordinator for Ti, and initiates the first phase of O3SEP.


†† Note that the improvement in the message cost is without considering the cost incurred in maintaining
a synchronized clock.
The prepare phase:

• The coordinator generates the subordinates Ti,1, Ti,2, ..., Ti,n and sends them to
the nodes N1, N2, ..., Nn, respectively. The coordinator also sends to each of the
subordinate nodes the security level s = L(Ti,k) of each subordinate, k = 1, 2, ..., n.

• The DTM at each Nk executes Ti,k as a transaction of level s and responds with a
vote as follows:

Ti,k sends a yes vote if the subordinate successfully completes its execution;
otherwise it sends a no vote.

If the subordinate that votes yes has read a data item x such that L(x) < L(Ti,k),
then the subordinate includes in its vote message, in addition to the vote, a one-bit
read-low(Ti,k) indicator (which indicates that it has read low data) and the times
$t^{lock}_{i,k}$ and $t^{yes}_{i,k}$.


The decision phase:

• If the coordinator receives at least one no vote or it times out waiting for a vote, it
aborts the transaction and sends an abort message to all its subordinate nodes.

• The coordinator commits Ti if all subordinate nodes voted yes and the coordinator
has not received read-low(Ti,k) from any Nk. It then sends commit messages to all
its subordinate nodes.

• If the coordinator receives a read-low(Ti,k) from some Nk, then the following
additional steps are performed by the coordinator:

- determine $max^{lock}_{i}$

- for each Nj such that $t^{yes}_{i,j} < max^{lock}_{i}$, an additional round of messages is sent
to node Nj:

* The coordinator sends to node Nj a confirm message to confirm the
commit.

* If node Nj has not released its read-locks on a lower level data item during
the prepared state, then it responds with a confirmed message; otherwise,
it sends a not-confirmed message together with $t^{rel}_{i,j}$.

- At this point, if the coordinator receives a confirmed message from all Nj to
which the additional round of messages has been sent, then it sends commit to all
its subordinate nodes.

Otherwise it evaluates $min^{rel}_{i}$, and if $min^{rel}_{i} > max^{lock}_{i}$, it sends commit
to all its subordinate nodes; otherwise, it sends abort.

• The subordinate node either commits or aborts the subordinate according to the
message received, and then it sends an acknowledgment back to the coordinator.

• After receiving the acknowledgment from all the subordinates, the coordinator
terminates Ti.
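
The timestamp comparison at the heart of the decision phase can be sketched as follows. This is an illustration under stated assumptions rather than the algorithm itself: we assume that every subordinate's last-lock time is known to the coordinator, that all clocks are perfectly synchronized, and that the confirm round has already been carried out so that any early release time is available. The names `SubordinateReport` and `o3sep_decide` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SubordinateReport:
    voted_yes: bool
    last_lock_time: float                       # time the last lock was obtained
    read_low: bool = False                      # read-low indicator
    early_release_time: Optional[float] = None  # first early S-lock release, if any


def o3sep_decide(reports: List[SubordinateReport]) -> str:
    """Commit unless some lock was released before the last lock was acquired."""
    if not reports:
        raise ValueError("a transaction needs at least one subordinate")
    if not all(r.voted_yes for r in reports):
        return "abort"
    # Latest lock acquisition over all subordinates of the transaction.
    max_lock = max(r.last_lock_time for r in reports)
    # Early releases reported by subordinates that read low data.
    releases = [r.early_release_time for r in reports
                if r.read_low and r.early_release_time is not None]
    if not releases:
        return "commit"  # every queried node answered confirmed
    # Still two-phase if the earliest release comes after the latest acquisition.
    return "commit" if min(releases) > max_lock else "abort"
```

In contrast to SEP, a broken S-lock does not by itself force an abort here: a release at time 5 is harmless if the last lock of the whole transaction was obtained at time 3, but it causes an abort if some subordinate obtained a lock at time 6.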
Theorem 3 If all transactions observe degree 3 isolation, every DTM uses O3SEP
(algorithm 3) for atomic commitment, and every LTM uses SLP (algorithm 1) for concurrency
control, then every legal history H consisting of local and distributed transactions will be
serializable.

Proof: To prove this theorem, it is enough to prove that every distributed transaction
Ti is always two-phase. If none of Ti's subordinates release any locks to
accommodate a write request from a lower level transaction or subordinate, then Ti
obviously is two-phase since O3SEP allows Ti to release all its locks only at the commit time.
Therefore, we now need to prove that even though Ti releases some of its locks during
its execution, it is still two-phase. For Ti to be not two-phase, at least one of Ti's
subordinates must acquire a lock after a subordinate of Ti releases its S-lock. In such a case,
$min^{rel}_{i} < max^{lock}_{i}$. According to the O3SEP protocol, transaction Ti will be aborted.
This guarantees that every distributed transaction in H is two-phase. Therefore, H is
serializable. □

7. CONCLUSION

In this paper we have given a secure locking protocol (SLP) and a secure commit
protocol (SEP) for different degrees of isolation. SLP is free of starvation, and SEP
requires only 4n messages for degree 0, 1, and 2 isolation. For degree 3 isolation, SLP
may suffer from starvation, although the probability of starvation is quite small, and SEP
may sometimes require more than 4n, but never more than 6n messages. We suggest a
way of reducing this additional cost in messages using synchronized clocks.
In the late summer of 1993, the authors¹ began research under contract to the National Security
Agency and Rome Laboratory to begin development of an informal access control model [1] for
a trusted object oriented database management system (ODBMS). This study is intended to
be the basis for future efforts to produce a trusted prototype of an ODBMS offering
features comparable to those required for Class B1 of the DoD's Trusted Computer System
Evaluation Criteria (TCSEC) and the associated Trusted Database Interpretation (TDI) of the TCSEC.


The philosophy behind object oriented technology is becoming the de rigueur standard for the
industry, even though there is presently no universal model that serves as a standard for
individual ODBMS implementations. Several ODBMS products are currently serving a growing user
community. They are being used with greater frequency by the government and industry
because they offer many benefits over existing technologies, such as increased performance for
complex applications, support for unusual data types, and a highly flexible data model.
Additionally, with the reduction of budgets in both the government and industry, object-oriented
technology is gaining a wider audience for its potential to reduce overall life cycle costs by
enabling component based software development, promoting software re-use, and supporting
extensible solutions. It is evident that ODBMS technology will be the basis for future DoD
database applications. There is a clear need for a high integrity, multilevel secure ODBMS.

Although there have been numerous paper studies, there are presently no worked examples of a
trusted ODBMS, extant or under development. It is equally important to note that although the
more traditional concepts and architecture of relational DBMS (RDBMS) tend to dominate the
TDI, there are no interpretations of how specific TCSEC requirements are to be applied to an
ODBMS. The present effort is intended to support future research and development needed in
order better to understand a) the security related issues in the design and implementation and b)
the evaluation, and especially the assurance requirements, for a high-integrity, multilevel secure
ODBMS that offers B1 features.

This study is intended to take a fresh look at the trusted DBMS problem. Previous relational
model-based approaches have largely been based on a set of security architectures that lead to
polyinstantiation or selective database replication as a means of preserving confidentiality.
However, use of this strategy is often at the cost of database consistency, integrity, performance,
and the ability to see updates without delay. Further, the semantics and operational
consequences of polyinstantiation have sometimes proven to be inadequately understood by users and
have resulted in database update inconsistencies. Given that object-oriented architectures invite
the introduction of new security architectures, the opportunity is present to re-examine
alternatives that could result in a more favorable tradeoff between the objectives of confidentiality and
database integrity.


¹ Marv Schaefer was affiliated with CTA Incorporated at the time.
This panel will explore the themes and trends in database security, including security policies and models
as well as the user’s perspective and requirements.

As  security  for  relational  database  systems  matures,  we  see  that  the  new  commercially  available  products 
offer  hooks  for  flexible  security  policies  -  to  accommodate  application-specific  requirements.  These  hooks 
loosen  the  restrictions  on  information  flow  between  levels  in  a  controlled  manner.  For  discretionary  access 
controls,  the  analysis  of  group-based  privileges  and  the  emergence  of  new  security  paradigms,  such  as 
separation  of  powers,  also  suggests  a  trend  toward  choices  among  security  policies  through  flexible 
configuration  of  the  security  parameters.  This  raises  the  question  of  whether  there  might  be  several 
orthogonal  dimensions  that  help  to  define  a  space  of  alternative  security  policies  and  models,  and  the  extent 
to which these dimensions can be made non-interfering.

The  need  for  assurance  and  certifiability  conflicts  with  flexible  security  policies.  One  must  determine  the 
consequences  of  each  alternative  security  policy,  and  assure  that  for  each  combination  no  operational  flaws 
or  loopholes  exist.  Thus  vendor  products  often  seek  certification  based  upon  the  one  preferred  configuration. 

Another  trend  is  the  expansion  of  multilevel  security  from  relational  systems  to  encompass  object-oriented 
systems.  Issues  of  granularity  of  classification  in  the  relational  model  led  to  decomposition  of  multilevel 
relations,  and  then  to  the  concerns  of  polyinstantiation  --  which  some  would  say  currently  includes  several 
distinguishable causes, but which have similar operational symptoms.

While  the  object  model  is  more  complex  than  the  relational  model,  the  use  of  object  identity  provides  some 
control  for  polyinstantiation.  The  object  model  also  highlights  the  fact  that  there  is  interdependence  among 
the  classification  levels  assigned  to  schema  components  and  to  object  and  attribute  instances.  Perhaps  these 
classification  levels  of  the  model  should  be  subject  to  security  constraints  so  as  to  support  a  consistent 
security  policy,  as  has  been  proposed. 

The theme of constraints arises even more directly as a consequence of application semantics. Such
semantic constraints span multiple levels and thus may conflict with security-based separation of
information.  Distributed  database  systems  demonstrate  an  analogous  conflict  between  physical  separation 
of  information  due  to  distribution  versus  similar  logical  interdependence  of  data  for  semantic  consistency. 
For  distributed  systems,  secure  design  will  add  another  dimension  of  complexity.  It  will  be  interesting  to 
see  how  the  techniques  developed  for  single  site  multilevel  security  -  especially  the  replicated  approach  - 
may be extended to distributed security.

At  the  heart  of  such  conflicts  with  application  semantics  is  the  theme  of  data  integrity  and  consistency.  For 
example,  polyinstantiation  often  conflicts  with  application  semantics  regarding  uniqueness  of  keys  and 
consistency  of  single-valued  attributes.  The  security  field  is  ripe  for  a  renewed  inquiry  into  the  issues  of  data 
integrity  and  faithful  modeling  of  the  application.  Also,  the  rapidly  growing  distributed  information  Web  is 
providing an opportunity for security to support both commercial and government applications in a new
kind  of  information  network.  Perhaps  an  ultimate  challenge  for  security  is  whether  it  can  contribute  to  the 
safety  of  individual  systems  and  the  safety  of  composed  or  interconnected  systems. 
Abstract

In a situation where data are polyinstantiated, one problem which arises is that the high users can observe
both the high level data and the low level cover story, the high data being contradicted by the cover story.
If the database provides the high users with all these data without explanation, the high users would be
faced with an inconsistent situation. In this paper, we propose a formal mechanism that enables the global
consistency of the database to be restored. This mechanism is based on the merging of the high level view of
the database with the lower level views by assuming that, when two contradictory facts exist in the database,
the higher sensitive fact is the most reliable one and the lower fact is a cover story. However, in the case of
a partial ordering of the security levels, the use of the order defined on the security levels is not sufficient
to restore the database consistency. In this case, we suggest associating topics with data to represent
some semantic links between data. Then, we use topics to parameterize the order of the security levels in
order to define a finer grain of preference when the database consistency is restored.


Introduction 

Since its introduction in the Seaview Model in 1987 [DLS+87], polyinstantiation has generated
a great deal of controversy. Much has been written on this topic, and several panels have been
organized. Two extreme positions can be identified with respect to polyinstantiation:

1. Polyinstantiation is an inevitable phenomenon of multilevel data. It is a property of information
and not of any specific technology. In this case, large numbers of polyinstantiated tuples are
usually generated and the problem is to investigate how best to deal with these spurious tuples.

2. Polyinstantiation and integrity are fundamentally incompatible. The results of polyinstantiation
are unacceptable for an operational system because it could prevent a job from being done
properly. In this case, solutions must be found to avoid polyinstantiation.

Our view lies between these two extreme points. It is argued in [Bur90, Bur91] that cover stories
are the only good reason for the use of polyinstantiation. We agree with the point of view that
representing cover stories is an appropriate use of and motivation for polyinstantiation. However, it
is important to understand that there is nothing fundamental about the occurrence of polyinstantiation.
Jajodia and Sandhu [JS91, SJ91] have shown how it is possible to prevent polyinstantiation
in many situations where there is no need for cover stories. Hence, polyinstantiation should only be
used where it is appropriate.

When this point is made clear, several problems have to be solved. In a situation where data
are polyinstantiated, the high users try to lie to the low users in order to cause them to believe
something which is incorrect. The problem of how to properly choose a cover story is discussed in
[GL92]. In particular, a cover story usually requires consistency to be effective. The second problem
is that the high users are authorized to have a complete view of the database. They can observe
both the high level data and the low level cover story, the high data being contradicted by the cover
story. If the database provides the high users with all these data without explanation, the high users
would be faced with an inconsistent situation.

In this paper, we propose a mechanism that allows us to restore the global consistency of the
database. This mechanism is based on the merging of the high level view of the database with the
lower level views by assuming that, in case of polyinstantiated data, the higher sensitive datum is
the most reliable one and the lower datum is a cover story¹. However, in specific situations, the
use of the order defined on the security levels is not sufficient to restore the database consistency.
These situations appear in the case of a partial ordering of the security levels. For these situations,
we propose to associate some topics with data. Topics allow representation of some semantic links
between data [CD88, CD89]. For instance, we can associate the same topic Localization with the
two relations Departure-City and Arrival-City. We will show that in many cases it is actually
possible to identify a topic with a set of information a user needs to know to perform a particular
job. Then, we use topics to parameterize the order of the security levels. This allows us to define a
finer order of preference when database consistency is restored.

The remainder of this paper is organized as follows. Section 1 reviews the concept of polyinstantiation,
emphasizing those aspects which are important to the objective of our paper. In section
2 we show, through examples, how to restore database consistency when polyinstantiation is used.
Section 3 proposes a formal mechanism for restoring the database consistency. In section 4, we
illustrate how to use this mechanism to provide answers to variously classified users. In section 5,
we compare our approach with related work, and section 6 concludes the paper with further work that
remains to be done.


1  Polyinstantiation  and  cover  stories 

A common use for cover stories is to hide the existence of an otherwise sensitive event. For example,
the plane F127 might be said to carry food when its actual cargo is gas-masks. Without a cover
story, an unclassified user who asks the query "What is the cargo carried by F127?" will be provided
with the answer "I don't know" or "You are not cleared to know that". In both cases, the fact that an
answer is not provided may disclose the existence of the secret mission of F127. In many situations,
this disclosure may not be desirable if a mission is to be successful.

However, notice that in many other situations there is no need for cover stories. For instance,
let us consider a database containing information about the members of a Secret Service and let us
assume that an unclassified user asks the query "Give me the list of spies". As it is well known by
everybody that information related to a Secret Service is sensitive, we argue that the best answer to
this question is "You are not cleared to know that". Hence, a cover story should only be used where
appropriate. This point has been noticed several times before (see for instance [Wis90, SJ92]).

On the other hand, polyinstantiation is a technique introduced by Denning et al. in [DLS+87].

¹ It is important to notice that we only make this assumption to compare two contradictory facts. It would generally
be  false  to  consider  that  the  high  level  data  are  more  reliable  than  the  lower  ones;  confidentiality  levels  are  not  used 
to  represent  the  data  reliability. 


It was used as a technique for closing a signaling channel which arises when an unclassified user
inserts a tuple that has the same primary key values as an existing but higher sensitive tuple. This
initial view of polyinstantiation has been discussed and many researchers ([Bur90, SJ92] for instance)
consider that these technical arguments are not the best motivation for polyinstantiation. We agree
with this point of view and, like [Bur90, SJ92], we consider that:

Claim 1 Polyinstantiation is actually a technique that must only be used to support cover stories.

However, Wiseman argued in [Wis92] that polyinstantiation is not essential for supporting cover
stories and he showed that SWORD is perfectly capable of supporting cover stories without using
polyinstantiation. Wiseman actually considers that polyinstantiation is a poor technique for cover
stories because it is difficult to prevent them arising spuriously. His conclusion would be that
polyinstantiation is a threat to the global integrity of the database. For instance, let us consider
the scenario adapted from the one he proposed in [Wis90].

Example 1 “Suppose an officer wishes to send gas-masks to the forces at The Front. He queries the
database and discovers that aircraft F127 is suitable and available. He then 'books' that aircraft by
recording in the database that F127 is carrying gas-masks to the front. He decides that, for strategic
reasons, this fact is Secret (..).

Next, suppose the system is also used by the Army Catering Corps to arrange delivery of rations
to the troops. This activity is less sensitive than supplying armaments, so the officer in charge is
only cleared to Confidential. Wishing to restock forces at headquarters with champagne, the catering
officer queries the database and finds that aircraft F127 is suitable and available. He is not told that
it is already booked because he is only cleared to Confidential and hence, because of polyinstantiation,
his query does not see the secret fact. Therefore the officer goes ahead and arranges for F127 to
carry champagne to HQ. The database now contains two conflicting facts². (..) The database is
therefore inconsistent.” □

If such a scenario could arise, then we would really be faced with an integrity problem because it
is not clear how to answer the question "Who will win?": the armament officer who wants to carry
gas-masks to the front or the catering officer who wants to carry champagne to HQ. However, it is
not difficult to prevent this situation from occurring. When the armament officer inserts the secret
fact that F127 is carrying gas-masks to the front, then there are two possibilities:

1. The armament officer wants to hide the existence of a secret mission for F127. Then, this officer
himself must create a confidential session to insert a cover story to protect the existence of the
secret information.

2. The armament officer does not want to hide the existence of a secret mission for F127. In
this case this officer must also create a confidential session to insert that F127 is booked for a
secret mission. For this purpose we can use, as suggested in [SJ91], the special symbol restricted
whose meaning is that some data exists but is higher classified.

In both situations, the catering officer will know that F127 is already booked. Deciding whether
the catering officer is authorized to modify the mission arranged by the armament officer does
not depend on the confidentiality policy, but depends on an integrity policy which defines who is
permitted or prohibited to perform updates in the database. For instance, let us assume that the
unclassified data (actually a cover story) "F127 is carrying champagne" is stored in the database
and user A wants to update this data. Then, there are two possibilities:

1. The integrity policy says that A is prohibited from modifying the cargo of F127. In this case, A's


² We actually assume that F127 can carry only one cargo and has only one destination.


update would be rejected because F127 is already booked. Notice that this does not represent
a signaling channel; A's update is rejected because of the existence of an unclassified datum.

2. A is permitted to modify the cargo of F127. Let us assume that this update will represent
an effective change of the real cargo. In this case, the unclassified cargo will be updated
and the update is likely to change the secret cargo as well. No one has proposed this solution seriously
because deleting the existing secret cargo would clearly represent a threat. However, by using
an integrity policy, we argue that in many cases this solution becomes realistic; we have only
to properly define who is permitted to perform the update.

Similarly, Sandhu and Jajodia introduced in [SJ91] special integrity privileges for changing restricted
to unrestricted. However, it is not the purpose of this paper to further discuss integrity policies. We
just want to justify our second claim.

Claim 2 In case of polyinstantiated data, there must be only one person who controls the secret
information and the associated cover story.

This claim allows us to conclude that, when two contradictory facts exist in a database, the higher
sensitive fact is always the most reliable one and the lower fact is a cover story.

Continuing the above example, Wiseman then wants to show that polyinstantiation is unacceptable
because it might prevent a job from being done properly.

Example 1 (continued) “Suppose that other officers use the database to receive their orders. The
flight crew of F127 are cleared to confidential because they do not need to know about the cargo they
are transporting. Therefore the database tells them that they are going to HQ because the fact that they
should be going to the Front is secret. Note that the crew are about to make a big mistake.” □

A possible solution to prevent this kind of mismatch is to introduce a set of compartments of
information to create a finer grain of classification on the basis of need-to-know. For instance, in
the above example we can create two compartments, Destination and Freight. The flight crew
would actually be cleared up to (Secret, Destination) and the data "The destination of F127 is
The Front" would also be classified (Secret, Destination). In this case, the flight crew would be
told the proper destination of F127. On the other hand, the data "The cargo of F127 is gas-masks"
would be classified (Secret, Freight) and the flight crew would not know about the cargo they are
transporting. Hence we can state our third claim.

Claim 3 When polyinstantiation is used, we must always consider the specific job-related need to
know of users, i.e. do not provide a cover story to users if this lie would prevent them from properly
performing their job.

If this third claim is respected, then we argue that there is no fundamental incompatibility between
polyinstantiation and integrity. Moreover, according to this claim, if two contradictory facts exist
in a database and if a user needs to know one of those facts to properly perform his job, then the
job-related fact is the most reliable one and the other fact is a cover story.
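
The compartment scheme described above can be illustrated with a minimal sketch. It assumes, as is usual in multilevel models, that a level is a pair (classification, set of compartments) and that a user may read a datum when the user's level dominates the datum's level. The names `Level`, `dominates` and the numeric ranks in `RANK` are illustrative assumptions and do not come from the paper.

```python
from dataclasses import dataclass

# Illustrative ranking of classifications (an assumption of this sketch).
RANK = {"Unclas": 0, "Confidential": 1, "Secret": 2}


@dataclass(frozen=True)
class Level:
    classification: str
    compartments: frozenset = frozenset()


def dominates(a: Level, b: Level) -> bool:
    """True if a has at least b's classification and all of b's compartments."""
    return (RANK[a.classification] >= RANK[b.classification]
            and a.compartments >= b.compartments)


flight_crew = Level("Secret", frozenset({"Destination"}))
dest_fact = Level("Secret", frozenset({"Destination"}))  # destination of F127
cargo_fact = Level("Secret", frozenset({"Freight"}))     # cargo of F127

can_read_destination = dominates(flight_crew, dest_fact)
can_read_cargo = dominates(flight_crew, cargo_fact)
```

Under these assumptions the flight crew can read the destination but not the cargo, which is the behaviour the third claim asks for.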


2  Merging  polyinstantiated  databases 

The main idea of this paper is to apply techniques of multi-source reasoning [DDP92], [Cho92],
[Cho94b], [Cho93] in order to provide a consistent view of the database when polyinstantiation is
used.


2.1 Multi-source reasoning: state of the art

Roughly speaking, multi-source reasoning is a kind of reasoning with information provided by different
sources of information. This can be seen as a way of merging several sources of information or
databases. One problem which arises is that of inconsistency: even if each source is self-consistent,
the global set of information may be contradictory. One way of solving this problem is to consider
that the sources are ordered according to a total order which expresses their relative reliability.
[Cho94a] and [Cho93] show that it is actually more judicious to consider as many orders as topics.
Intuitively, a topic is a cluster of formulas which "concern" the same thing. In this section, we will
show that it is generally necessary to consider topic-dependent orders on the different databases.
Furthermore, the previous works defined two attitudes called "suspicious" and "trusting". They
differ in that when a source contradicts a more reliable source, we can either globally reject it or
we can reject only the contradictory information. If topics are not used, then the trusting attitude
is clearly more adequate because the suspicious attitude would cause all lower classified data to
be ignored as soon as a contradiction appears between the lower classified database and the higher
classified database. However, in using topics, we can define a finer order of preference when database
consistency is restored. We will show in the following that when applying multi-source reasoning
with topics to polyinstantiated databases, the "suspicious" attitude becomes more adequate.


2.2 Application of the multi-source reasoning techniques to polyinstantiated
databases: examples

We suggest applying the multi-source reasoning technique to restore database consistency when
polyinstantiation is used. In a multilevel context, each piece of information is assigned a security
level. Hence one can partition the global multilevel database into single-level databases³
corresponding to each security level. The data stored in each single-level database generally correspond
to a partial view of the universe by users at the corresponding security level. Indeed, a user at a
given security level is cleared to observe all the single-level databases which are dominated by the
user's clearance. To provide the user with a complete view of the universe corresponding to his
clearance level, we suggest merging all these single-level databases.

In the case of polyinstantiation, the multilevel database contains some contradictory facts. Our
goal is to restore a consistent view of the universe at each security level. To achieve this goal, we
make the following assumption:

Claim 4 Each single-level database is internally consistent.

Hence, if the multilevel database contains two contradictory facts, then these facts would belong
to two different single-level databases. To achieve claim 4, we must precisely constrain the use of
polyinstantiation (see [CY92] for a detailed discussion).

If claim 4 is respected, then we propose merging a given single-level database with all the lower
single-level databases. If the security levels are totally ordered, then this principle allows us to
provide a consistent view of the universe at each security level. It is illustrated in the following
example.

Example 1 (continued) Let us consider the state of the database after the armament officer has
stored in the database the secret fact that F127 is carrying gas-masks to the front. We also assume
that the armament officer has inserted, in order to hide the secret mission for F127, the unclassified
fact that F127 is carrying champagne to HQ. Finally, Captain Brown has been ordered to be the pilot


³ The single-level databases are perhaps not real databases but only virtual databases.
Figure 1: Complete view of the multilevel database


of F127. This last fact is also unclassified. These facts are respectively inserted into two single-level
databases denoted DB_Secret and DB_Unclas (see figure 1).

Since there is no lower level than unclassified, the complete view of the universe by users at
unclassified level is equal to the single-level database DB_Unclas. But, to provide a secret user
with a complete view of the universe, we need to merge DB_Secret with DB_Unclas. In merging
Dest(F127, Front) with Dest(F127, HQ) a contradiction appears. However, as Dest(F127, Front)
is more sensitive than Dest(F127, HQ), claim 2 allows us to conclude that Dest(F127, Front) is
more reliable than Dest(F127, HQ). Hence, a secret user will be provided with Dest(F127, Front).
Similarly, after merging the two single-level databases, we will obtain Cargo(F127, Gas-Masks) in
the secret view. Finally, there is no fact in DB_Secret which contradicts
Pilot(F127, Captain_Brown). In this case, a secret user could adopt two different attitudes:

1. A suspicious attitude. As DB_Unclas contains facts which are contradicted by DB_Secret, the
secret user does not believe in any facts stored in DB_Unclas.

2. A trusting attitude. The secret user believes in any facts of DB_Unclas which are not contradicted
by DB_Secret.

In the sequel, we will discuss which attitude is the best one in the case of polyinstantiation. In our
example, if a secret user adopts a trusting (resp. suspicious) attitude then he will (resp. will not) believe
that Pilot(F127, Captain_Brown). Figure 2 shows the complete view of the universe by unclassified
and  secret  users,  we  respectively  denote  them  DFF'rin-iaf  and  BR^secrett  when  the  secret  users  adopt 
a trusting attitude. In this example it seems that this attitude is more adequate because it allows
the secret user to observe the likely correct information Pilot(F127, Captain_Brown). □


Secret view:        Dest(F127, Front)
                    Cargo(F127, Gas-Masks)
                    Pilot(F127, Captain_Brown)

Unclassified view:  Dest(F127, HQ)
                    Cargo(F127, Champagne)
                    Pilot(F127, Captain_Brown)


Figure 2: View at the unclassified and secret levels (trusting attitude)
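
The two attitudes can be compared on this example with a small sketch. It is only an illustration under stated assumptions: facts are stored as a dictionary from (predicate, key) to a single value, so that two facts contradict each other exactly when they give different values for the same pair (which matches the assumption that F127 has one cargo and one destination). The function name `merge` and this representation are hypothetical and are not the formal mechanism proposed later in the paper.

```python
def merge(high: dict, low: dict, attitude: str = "trusting") -> dict:
    """Merge a higher-level database with a lower-level one.

    Facts are {(predicate, key): value}; a conflict is a pair present in
    both databases with different values.
    """
    conflicts = {k for k in low if k in high and high[k] != low[k]}
    if attitude == "suspicious" and conflicts:
        # The lower database is contradicted, so it is rejected as a whole.
        return dict(high)
    merged = dict(low)
    merged.update(high)  # on a conflict the higher fact wins
    return merged


db_secret = {("Dest", "F127"): "Front", ("Cargo", "F127"): "Gas-Masks"}
db_unclas = {("Dest", "F127"): "HQ", ("Cargo", "F127"): "Champagne",
             ("Pilot", "F127"): "Captain_Brown"}

trusting_view = merge(db_secret, db_unclas, "trusting")
suspicious_view = merge(db_secret, db_unclas, "suspicious")
```

With the trusting attitude the secret view keeps the pilot fact; with the suspicious attitude it is lost together with the two cover stories.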


However, a reader in search of simplicity may find that this internal merging of databases is
useless and introduces unnecessary complications. He may consider, as most multilevel databases
suggest, that it is sufficient to provide a secret user who queries the multilevel database to know
the destination of F127 with the two facts:

Dest(F127, HQ), Dest(F127, Front)




and to mention that the first fact is unclassified and the second one is secret. Then, he will
assume that this secret user can perform an "external" merging of the two facts and derive that
Dest(F127, Front) is the real fact and Dest(F127, HQ) is a cover story. Unfortunately, this "solution"
does not apply to many situations, especially in the case of deductive databases. The following
example illustrates this problem.

Example 2 Let us consider the following state of a multilevel database used in a travel agency:


DB_Secret:  Nb_Passengers(F128, 200)

DB_Unclas:  Full(F128)
            Capacity(F128, 250)
            ∀f, c: Capacity(f, c) → (Full(f) ↔ Nb_Passengers(f, c))

Figure 3: Complete view of the multilevel database

We assume that, in this travel agency, some seats are kept free for secret users. This is the
reason why the database stores the unclassified fact that F128 is full even though some seats are
still available for secret users. If a secret user queries the database to know if F128 is full, then the
database management system (DBMS) will answer yes and mention that this fact is unclassified.
Notice that if we do not merge the secret database with some unclassified facts, then the DBMS
cannot tell this secret user that F128 is actually not full and mention that this fact is secret. Hence,
the DBMS only provides the secret user with the unclassified cover story! In using the approach
we suggest in this paper, the complete view of the universe at the secret level is represented in the
following database.


Nb_Passengers(F128, 200)
Capacity(F128, 250)
∀f, c: Capacity(f, c) → (Full(f) ↔ Nb_Passengers(f, c))


Figure 4: Complete view at the secret level


Now, if a secret user queries the database to know if F128 is full, then the DBMS derives the
answer no and mentions that this fact is secret. □
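
The derivation performed by the DBMS in this example can be sketched as follows. The sketch assumes the rule "a flight is full exactly when its number of passengers equals its capacity" and a hypothetical dictionary representation of the secret-level view, in which the cover story Full(F128) has already been discarded by the merge. The function name `is_full` is an illustrative assumption.

```python
def is_full(view: dict, flight: str):
    """Apply Capacity(f, c) -> (Full(f) <-> Nb_Passengers(f, c)).

    Returns True or False when the view holds both the capacity and the
    number of passengers of the flight, and None when it cannot decide.
    """
    capacity = view.get(("Capacity", flight))
    passengers = view.get(("Nb_Passengers", flight))
    if capacity is None or passengers is None:
        return None
    return passengers == capacity


# Complete view at the secret level (the merged database of the example).
secret_view = {("Nb_Passengers", "F128"): 200, ("Capacity", "F128"): 250}
answer = is_full(secret_view, "F128")
```

An external merging of two stored facts could not produce this answer, because "F128 is not full" is never stored anywhere; it only follows from the merged view.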

Unfortunately, in case of a partial ordering of the security levels, it is not sufficient to use the
order defined on the security levels to restore the database consistency. The following example
illustrates this problem.

Example 1 (continued) Let us assume that the flight crew of F127 use the database to receive
their orders. To allow the flight crew to know about the destination but not about the cargo, we
have created a compartment Destination. The flight crew are cleared up to (Secret, Destination)
and the data Dest(F127, Front) is actually classified at (Secret, Destination) to allow the flight
crew to know about the proper destination of F127.

Similarly, let us assume that the ground crew use the database to know what cargo to load.
We assume that they do not need to know the destination. Hence, we create another compartment
Freight; the ground crew are cleared up to (Secret, Freight) and the data
Cargo(F127, Gas-Masks) is classified at (Secret, Freight).


DB_(Secret, Dest):     Dest(F127, Front)
                       Cargo(F127, Food)

DB_(Secret, Freight):  Cargo(F127, Gas-Masks)

DB_Unclas:             Dest(F127, HQ)
                       Cargo(F127, Champagne)

Figure 5: Complete view of the multilevel database

We also assume that the following unclassified integrity constraint is stored in the database:
∀x, Cargo(x, Champagne) → Dest(x, HQ)

This constraint says that champagne may only be sent to HQ. Let us now consider the flight crew's
view of the universe. They can observe the exact destination of F127, Dest(F127, Front), and
the unclassified cargo of F127, Cargo(F127, Champagne). However, this view is contradicted by
the above integrity constraint. By using claim 2, the flight crew know that Dest(F127, Front) is
the proper destination of F127. So, the flight crew will derive that Cargo(F127, Champagne) is
a cover story. A possible solution to prevent this kind of disclosure is to introduce a second cover
story for the flight crew, for instance Cargo(F127, Food). On the other hand, the ground crew can
observe that Dest(F127, HQ) ∧ Cargo(F127, Gas-Masks). In this case, we do not have to change
the cover story Dest(F127, HQ) because we consider that it is possible to restock the headquarters
with gas-masks. Figure 5 sums up the complete view of the multilevel database.
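
The effect of the integrity constraint on each crew's view can be checked with a short sketch. It assumes a hypothetical dictionary representation of a view, from (predicate, aircraft) to a value, and treats an unknown destination as not being HQ (a closed-world assumption made only for this illustration). The function name `violations` is illustrative.

```python
def violations(view: dict) -> list:
    """Aircraft x with Cargo(x, Champagne) in the view but without Dest(x, HQ)."""
    return [plane for (pred, plane), value in view.items()
            if pred == "Cargo" and value == "Champagne"
            and view.get(("Dest", plane)) != "HQ"]


# Flight crew's view if only the unclassified cover story were available.
flight_crew_naive = {("Dest", "F127"): "Front", ("Cargo", "F127"): "Champagne"}
# Flight crew's view once a second cover story, Food, is introduced.
flight_crew_fixed = {("Dest", "F127"): "Front", ("Cargo", "F127"): "Food"}
# Ground crew's view: real cargo, cover-story destination.
ground_crew = {("Dest", "F127"): "HQ", ("Cargo", "F127"): "Gas-Masks"}

naive_conflicts = violations(flight_crew_naive)
fixed_conflicts = violations(flight_crew_fixed)
ground_conflicts = violations(ground_crew)
```

Only the first view violates the constraint, which is why the champagne cover story would be exposed to the flight crew while the other two views remain consistent.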

In using a similar approach as the one used in the previous example, it is easy to build the
complete view of the universe by users at the (Secret, Destination) and (Secret, Freight) levels.
But to provide a user cleared up to (Secret, {Dest, Freight}) with a complete view of the universe,
we need to merge DB_(Secret, Dest) with DB_(Secret, Freight). In merging Cargo(F127, Food) with
Cargo(F127, Gas-Masks), a contradiction appears. As we cannot compare (Secret, Destination)
and (Secret, Freight), a user cleared at (Secret, {Dest, Freight}) cannot use the order of the security
levels to restore the database consistency. However, a user cleared at (Secret, {Dest, Freight})
knows that users cleared at (Secret, Freight) need to know the correct cargo to properly perform
their job. Hence, a user cleared at (Secret, {Dest, Freight}) can use this knowledge to deduce that
Cargo(F127, Gas-Masks) is the most reliable fact and, therefore, Cargo(F127, Food) is a cover
story.

We suggest introducing the concept of topic [CD89] to model the deduction the user at
(Secret, {Dest, Freight}) has performed to derive Cargo(F127, Gas-Masks). Data with similar
semantics are linked by the same topic. In our example, we may introduce three topics: Dest,
Freight and Crew. We respectively associate the topic Dest with the data Dest(F127, Front)
and Dest(F127, HQ), the topic Freight with Cargo(F127, Gas-Masks), Cargo(F127, Food) and
Cargo(F127, Champagne), and the topic Crew with Pilot(F127, Captain_Brown).

In order to provide secret users with a consistent view of the database, we use topics to
parameterize the order of the security levels and to merge data with this finer grain of preference.
For instance, we define the following total order of preference^ for merging information related to
the topic Dest:

(Secret, {Dest, Freight}) $>_{Dest}$ (Secret, Dest) $>_{Dest}$ (Secret, Freight) $>_{Dest}$ Unclas

In particular, this order means that information related to the topic Dest is more reliable at level
(Secret, Dest) than at (Secret, Freight). Similarly, we define the following total order of preference
for the topic Freight:

(Secret, {Dest, Freight}) $>_{Freight}$ (Secret, Freight) $>_{Freight}$ (Secret, Dest) $>_{Freight}$ Unclas

Notice that this order differs from the one defined for the topic Dest because, according to the
specific need to know of users at level (Secret, Freight), information related to the topic Freight is
more reliable at level (Secret, Freight) than at (Secret, Dest). Finally, for the topic Crew:

(Secret, {Dest, Freight}) $>_{Crew}$ (Secret, Dest) $>_{Crew}$ (Secret, Freight) $>_{Crew}$ Unclas

These orders of preference are used to restore database consistency at the (Secret, {Dest, Freight})
level. Figure 6 (next page) shows the resulting view of the database at the different security levels.

We have also pointed out that a user at (Secret, {Dest, Freight}) can adopt two different
attitudes: a suspicious attitude or a trusting attitude. Which attitude is best in the case of
polyinstantiation? If we do not use topics, then we have already noticed that the trusting attitude is
probably more adequate. However, we think that when using topics, the best attitude is the suspicious one.
Indeed, let us consider the following example: at the unclassified level, we insert that the champagne
carried by F127 is a Veuve Clicquot of 1981. Clearly, we do not want to provide the secret user with
the fact that F127 is actually carrying gas-masks called Veuve Clicquot and dated from 1981. So,
if a contradiction appears between the lower information related to a given topic and the higher
information related to the same topic, then the best attitude is to reject all the lower information
related to this topic, because this information is probably related to the same cover story, namely
that F127 is carrying champagne in our example. This attitude does not preclude the high user from
observing lower information related to another topic, for instance Crew, and thus knowing that the
pilot of F127 is Captain Brown. Hence, in separately merging information related to the same topic,
we argue that the best attitude is the suspicious one.
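The difference between the two attitudes can be illustrated with a minimal Python sketch. It is not the formal model of the next section: facts are simplified to (attribute, value) pairs, a single preference order is used for all topics, and the attribute-to-topic mapping `TOPIC_OF`, the `Vintage` attribute and the single-value rule in `contradicts` are assumptions made only for this illustration.

```python
from typing import Dict, List, Tuple

Fact = Tuple[str, str]  # (attribute, value), e.g. ("Cargo", "Gas-Masks")

# Hypothetical mapping from attributes to topics (chosen by the administrator).
TOPIC_OF: Dict[str, str] = {"Dest": "Dest", "Cargo": "Freight",
                            "Vintage": "Freight", "Pilot": "Crew"}


def contradicts(a: Fact, b: Fact) -> bool:
    """Assumed integrity constraint: each attribute has a single value."""
    return a[0] == b[0] and a[1] != b[1]


def merge(levels: List[List[Fact]], suspicious: bool = True) -> List[Fact]:
    """Merge fact lists given from the most to the least preferred level."""
    merged: List[Fact] = []
    for facts in levels:
        # Facts of this level that clash with what has already been accepted.
        clashing = [f for f in facts if any(contradicts(f, m) for m in merged)]
        bad_topics = {TOPIC_OF[f[0]] for f in clashing}
        for f in facts:
            if f in merged:
                continue
            if suspicious and TOPIC_OF[f[0]] in bad_topics:
                continue  # suspicious: reject every lower fact on a contradicted topic
            if not suspicious and f in clashing:
                continue  # trusting: reject only the contradicted fact itself
            merged.append(f)
    return merged


secret = [("Dest", "Front"), ("Cargo", "Gas-Masks")]
unclas = [("Dest", "HQ"), ("Cargo", "Champagne"),
          ("Vintage", "Veuve Clicquot 1981"), ("Pilot", "Captain Brown")]

print(merge([secret, unclas], suspicious=True))
print(merge([secret, unclas], suspicious=False))
```

With the suspicious attitude the vintage is discarded together with the champagne, while the pilot (topic Crew) is still inherited; with the trusting attitude the vintage survives and ends up attached to the gas-masks.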

^  The  sets  of  compartments  and  topics  may  generally  overlap 

^Actually,  this  order  would  be  defined  by  the  database  security  administrator 
Figure 6. View at the different security levels

Moreover, for the readers who might prefer the trusting attitude, it is important to notice that
the database administrator who is in charge of defining topics may choose a very general or a more
specific representation. In particular, if he decides to associate each pair of literals ($l$, $\neg l$) with a
topic, then the suspicious attitude and the trusting attitude collapse. Hence, the trusting attitude
may be seen as a particular case of the suspicious attitude. □

In the next section we propose a formal model for restoring the multilevel database consistency
which includes topics and adopts a suspicious attitude.


3  The  formal  model 


3.1 Assumptions of our model

• Security levels

Let us first assume the existence of a set of security levels which are ordered according to a
strict partial order, noted $>$.

$level_1 > level_2$ means that people cleared up to $level_1$ may access information which belongs to
$level_1$ or to $level_2$. People cleared up to $level_2$ are not allowed to access information at $level_1$.
For instance, the four levels $secret_{1,2}$, $secret_1$, $secret_2$, $unclas$ may be partially ordered in the
following way: $secret_{1,2} > secret_1 > unclas$, and $secret_{1,2} > secret_2 > unclas$.

•  The  language  and  the  approach. 

We assume that the language used to describe the databases is propositional. Notice that, even
if a first order language is apparently needed (tuples of relations), we can associate it with a
propositional one because of the domain closure (i.e. we assume that there is a finite number
of objects) [Rei78].

The databases which we consider here are sets of propositional literals and we adopt a model-
theoretic approach, i.e. each database is associated with its logical models. Notice that this
restriction to literals means we cannot represent rules in the database, such as those in example
2. This clearly represents some further work that remains to be done.

•  Topics 

We  assume  that  the  underlying  propositional  language  is  partitioned  in  several  pairwise  disjoint 
subsets  called  “topics”.  In  this  paper,  we  do  not  consider  the  case  of  topics  structured  with  an 
ISA  relation:  we  only  consider  that  topics  form  a  partition  of  the  language.  The  only  condition 
we impose on topics is the following one. Let $t$ be a topic and let $f$ be a formula:

$(f \in t) \iff (\neg f \in t)$

This condition says that the formula $f$ belongs to the topic $t$ if and only if the negated formula
$\neg f$ belongs to $t$.

Each topic is supposed to be a set of formulas which “concern” the same thing. For instance
the following two formulas: “the destination of F127 is the front” and “the destination of F127
is the headquarters” belong to the same topic “destination-of-F127”. But “the pilot of F127
is Captain Brown” does not belong to this topic. Notice however that the definition of topics
depends on the context. Indeed, if it were necessary, we could have considered only one topic
“F-127” which would have grouped the three previous formulas.

•  Topic-dependent  orders. 

As said previously, each security level is associated with a (possibly virtual) database. For
instance, in the above example, there would be four databases, respectively accessed by the
people cleared up to $secret_{1,2}$, $secret_1$, $secret_2$, and $unclas$.

The existence of cover stories makes these databases apparently inconsistent. The solution we
suggest is to allow the database security administrator to express topic-dependent orders on
security levels.

Let $t$ be a topic; we note $>_t$ a total order of levels which is associated with $t$.

Definition 1 Compatibility between security levels and topic-dependent orders.

Let $>$ be the partial order defined on the security levels. Let $>_t$ be a topic-dependent total
order. They are compatible iff for all levels $l_1$ and $l_2$ we have: $(l_1 > l_2) \Rightarrow (l_1 >_t l_2)$.

In other words, $>$ and $>_t$ are compatible iff $>_t$ is a total extension of $>$.

Claim 5 We assume that the different topic-dependent orders $>_t$ are compatible with $>$.
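The compatibility condition of Definition 1 can be checked mechanically. The following Python sketch is only an illustration: the level names are shorthand labels chosen here, the partial order is given as a set of (higher, lower) pairs, and a total order is given as a list from the most to the least preferred level.

```python
from itertools import permutations
from typing import List, Set, Tuple


def is_compatible(partial: Set[Tuple[str, str]], total: List[str]) -> bool:
    """True iff l1 > l2 in the partial order implies l1 >_t l2 in the total order.

    `partial` holds pairs (higher, lower); `total` lists the levels from the
    most preferred to the least preferred one.
    """
    rank = {level: position for position, level in enumerate(total)}
    return all(rank[higher] < rank[lower] for higher, lower in partial)


# Shorthand labels (an assumption of this sketch) for the four levels of the example.
LEVELS = ["S-DestFreight", "S-Dest", "S-Freight", "Unclas"]
PARTIAL = {("S-DestFreight", "S-Dest"), ("S-DestFreight", "S-Freight"),
           ("S-Dest", "Unclas"), ("S-Freight", "Unclas")}

# Enumerate every total order on the levels and keep the compatible ones.
extensions = [list(p) for p in permutations(LEVELS) if is_compatible(PARTIAL, list(p))]
for order in extensions:
    print(" > ".join(order))
```

For this partial order only two total extensions exist, differing in the relative position of the two incomparable secret levels, which is exactly the freedom the topic-dependent orders exploit.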

3.2  Semantics  of  the  suspicious  fusion  with  topic-dependent  orders 

Consider $n$ databases to be combined. Let us note $L$ the underlying propositional language associated
with $m$ topics $t_1 \ldots t_m$. The individual databases are finite sets of literals of $L$ which are satisfiable
(consistent) but not necessarily complete (a base $B$ is not complete if there is at least one literal $l$
such that $l \notin B$ and $\neg l \notin B$).
In this section, we give the semantics of a logic, called FUSION-S, whose language $L'$ is obtained
from $L$ by adding pseudo-modalities, i.e. marks on propositional formulas. These pseudo-modalities
are:

• $[O_1 \ldots O_m]$, where the $O_i$ are total orders on a given subset of $\{1, \ldots, n\}$ which are $t_i$-dependent
and compatible with $>$.

Let $O_1 \ldots O_m$ be $m$ total orders on $k$ databases. Our goal in this section is to define a semantics
for $[O_1 \ldots O_m]$. Intuitively, $[O_1 \ldots O_m]F$ means that $F$ is true in the database obtained by virtually
merging the $k$ databases according to the $m$ $t_i$-dependent orders $O_1 \ldots O_m$. The satisfiability of
$[O_1 \ldots O_m]F$ is defined in definition 5.

Remark Notice that the general form of these pseudo-modalities allows us to represent the
particular case where $k = 1$. In this case, there is only one database $i_1$ to be ordered.

So, $O_1 = O_2 = \ldots = O_m = (i_1)$. And $[i_1 \ldots i_1]F$ will mean that $F$ is true in database $i_1$.

By convention, $[i_1 \ldots i_1]$ will be noted $[i_1]$.

Definition 2 Let $m$ be an interpretation^ of $L$ and $t$ a topic of $L$. We define:

$m|t = \{ l : l \in m \text{ and } l \in t \}$

$m|t$ is the projection of the content of $m$ on the topic $t$.

Definition 3 Let $E$ be a set of interpretations of $L$ and $t$ a topic of $L$. We define:

$E|t = \{ m|t : m \in E \}$

Definition 4 Let $t$ be a topic and $O_t$ be a total order $(i_1 > \ldots > i_k)$ on $k$ databases. We define:

$R_t(O_t) = h_{i_k,t}(\ldots(h_{i_2,t}(R(i_1)))\ldots)$, where:

$h_{i_j,t}(E) = R(i_j)|t \cap E|t$ if not empty,

$h_{i_j,t}(E) = E|t$ else.


Definition 5 The unique model of FUSION is the pair $M = (W, \mathcal{R})$ where:

• $W$ is the finite set of all the interpretations of $L$,

• $\mathcal{R}$ is a finite set of subsets of $W$ such that any modality $[O_1 \ldots O_m]$ is associated with such a
subset noted $R(O_1 \ldots O_m)$. These subsets are defined by:

$R([i])$ is the set of models of database $i$. We note it $R(i)$.

$R(O_1 \ldots O_m) = \{ w : w = w_1 \cup \ldots \cup w_m$, where
$\forall i \in [1..m],\ w_i \in R_{t_i}(O_i)$ and
$\forall l \in L,\ l \in w \iff \neg l \notin w \}$


We can prove that:

Proposition 1 $R(O_1 \ldots O_m)$ is never empty.

This means that, although the bases are contradictory (because of the cover stories), the combined
base is not contradictory (i.e. the information provided at any level will not be contradictory).

We  assimilate  an  interpretation  of  L  with  a  set  of  literals. 
Definition 6 (Satisfaction of formulas).

Let $F$ be a formula of $L$. Let $F_1$ and $F_2$ be formulas of $L'$. Let $O_1 \ldots O_m$ be total $t_i$-dependent orders
on a subset of $\{1, \ldots, n\}$. Let $M = (W, \mathcal{R})$ be the unique model of FUSION and let $w \in W$.

FUSION, $w \models F$ $\iff$ $w \models F$

FUSION, $w \models [O_1 \ldots O_m]F$ $\iff$ $\forall w',\ w' \in R(O_1 \ldots O_m) \Rightarrow w' \models F$

FUSION, $w \models \neg F_1$ $\iff$ (FUSION, $w \not\models F_1$)

FUSION, $w \models F_1 \wedge F_2$ $\iff$ (FUSION, $w \models F_1$) and (FUSION, $w \models F_2$)

We note $M \models F$ iff $\forall w \in W$, FUSION, $w \models F$.

We are interested in finding formulas of the form $[O_1 \ldots O_m]F$ which are satisfied in the model
$M$, i.e. finding formulas $F$ which are true in the database obtained by merging the databases, when
the order is $O_1 \ldots O_m$.

3.3  Model  application 

Example Let us consider again the previous example. Let $L$ be the propositional language whose
propositions are: front, HQ, food, gas-masks, champagne, captainbrown. Let us define three topics:
Dest = {front, HQ}, Freight = {food, gas-masks, champagne} and Crew = {captainbrown}. We
consider the three topic-dependent orders $>_{Dest}$, $>_{Freight}$, $>_{Crew}$ defined by:

$O_{Dest}$: (Secret, {Dest, Freight}) $>_{Dest}$ (Secret, Dest) $>_{Dest}$ (Secret, Freight) $>_{Dest}$ Unclass,

$O_{Freight}$: (Secret, {Dest, Freight}) $>_{Freight}$ (Secret, Freight) $>_{Freight}$ (Secret, Dest) $>_{Freight}$ Unclass,

$O_{Crew}$: (Secret, {Dest, Freight}) $>_{Crew}$ (Secret, Dest) $>_{Crew}$ (Secret, Freight) $>_{Crew}$ Unclass.

Let us compute $R(O_{Dest}, O_{Freight}, O_{Crew})$ in the model $M$.

• $R(DB_{(Secret,\{Dest,Freight\})})$ = all the models of $L$ which satisfy the integrity constraints
expressing that a plane can carry only one freight and has only one destination.

• $R(DB_{(Secret,Dest)})$ = { (front, food, $\neg$HQ, $\neg$gas-masks, $\neg$champagne, captainbrown),
(front, food, $\neg$HQ, $\neg$gas-masks, $\neg$champagne, $\neg$captainbrown) }

• $R(DB_{(Secret,Freight)})$ = { (front, gas-masks, $\neg$HQ, $\neg$food, $\neg$champagne, captainbrown),
(HQ, gas-masks, $\neg$front, $\neg$food, $\neg$champagne, captainbrown),
(front, gas-masks, $\neg$HQ, $\neg$food, $\neg$champagne, $\neg$captainbrown),
(HQ, gas-masks, $\neg$front, $\neg$food, $\neg$champagne, $\neg$captainbrown) }

• $R(DB_{Unclass})$ = { (HQ, $\neg$front, champagne, $\neg$gas-masks, $\neg$food, captainbrown) }

Let us now compute $R_{Dest}(O_{Dest})$, $R_{Freight}(O_{Freight})$, $R_{Crew}(O_{Crew})$:

• $R_{Dest}(O_{Dest})$ = { front, $\neg$HQ }

• $R_{Freight}(O_{Freight})$ = { gas-masks, $\neg$food, $\neg$champagne }

• $R_{Crew}(O_{Crew})$ = { captainbrown }

Thus, by definition 5:

• $R(O_{Dest}, O_{Freight}, O_{Crew})$ = { front, $\neg$HQ, gas-masks, $\neg$food, $\neg$champagne, captainbrown }.

So, $M \models [O_{Dest} O_{Freight} O_{Crew}]$ (front $\wedge$ gas-masks $\wedge$ captainbrown),

i.e. the formula (front $\wedge$ gas-masks $\wedge$ captainbrown) is true in the base obtained by merging all
the bases which are under $DB_{(Secret,\{Dest,Freight\})}$. In other terms, a person who is cleared up to
(Secret, {Dest, Freight}) will know that “Captain Brown pilots the F-127 to the front with a cargo
of gas-masks”.
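The computation of this example can be replayed with a small Python sketch of the topic-wise suspicious fusion. It is an illustration under stated assumptions, not the authors' implementation: interpretations are encoded as frozensets of (proposition, truth value) pairs, the level labels are shorthand chosen here, the integrity constraints are read as "exactly one destination and exactly one freight", and each database is assumed to hold only the literals named in the call to `models`.

```python
from itertools import product
from typing import Dict, FrozenSet, List, Set, Tuple

Literal = Tuple[str, bool]      # (proposition, truth value)
Interp = FrozenSet[Literal]     # an interpretation, seen as a set of literals

TOPICS: Dict[str, Set[str]] = {
    "Dest": {"front", "HQ"},
    "Freight": {"food", "gas-masks", "champagne"},
    "Crew": {"captainbrown"},
}
PROPS = [p for props in TOPICS.values() for p in sorted(props)]


def models(db: Dict[str, bool]) -> Set[Interp]:
    """All interpretations satisfying `db` and the assumed integrity constraints."""
    result = set()
    for values in product([True, False], repeat=len(PROPS)):
        w = dict(zip(PROPS, values))
        one_dest = sum(w[p] for p in TOPICS["Dest"]) == 1
        one_freight = sum(w[p] for p in TOPICS["Freight"]) == 1
        if one_dest and one_freight and all(w[p] == v for p, v in db.items()):
            result.add(frozenset(w.items()))
    return result


def project(E: Set[Interp], topic: str) -> Set[Interp]:
    """E|t: keep only the literals whose proposition belongs to the topic."""
    return {frozenset(l for l in m if l[0] in TOPICS[topic]) for m in E}


def fuse_topic(order: List[str], R: Dict[str, Set[Interp]], topic: str) -> Set[Interp]:
    """R_t(O_t): intersect level by level, skipping a level whose projection
    on the topic contradicts what has been kept so far (suspicious attitude)."""
    E = project(R[order[0]], topic)
    for level in order[1:]:
        common = project(R[level], topic) & E
        if common:
            E = common
    return E


def fuse(orders: Dict[str, List[str]], R: Dict[str, Set[Interp]]) -> Set[Interp]:
    """R(O_1 ... O_m): unions of one projected interpretation per topic."""
    parts = [fuse_topic(orders[t], R, t) for t in TOPICS]
    return {frozenset().union(*choice) for choice in product(*parts)}


TOP, SD, SF, U = "S-DestFreight", "S-Dest", "S-Freight", "Unclas"
R = {
    TOP: models({}),
    SD: models({"front": True, "food": True}),
    SF: models({"gas-masks": True}),
    U: models({"HQ": True, "champagne": True, "captainbrown": True}),
}
ORDERS = {"Dest": [TOP, SD, SF, U], "Freight": [TOP, SF, SD, U], "Crew": [TOP, SD, SF, U]}

merged = fuse(ORDERS, R)
for w in merged:
    print(sorted(p for p, value in w if value))
```

Under these assumptions the fusion yields a single interpretation in which front, gas-masks and captainbrown are the only true propositions, in line with the result derived above.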

4  Answering  queries 

In this section, we show how to answer queries addressed to the database which is composed of several
databases attached to different security levels. This query evaluator is based on the semantics given
in the previous section.

First of all, we need to introduce the following definitions:

Definition 7 Let $DB_{i_1} \ldots DB_{i_n}$ be $n$ databases associated with $n$ security levels. Let $O_t$ be a total
order on these databases which is dependent on a topic $t$. Let us note $O_t = \{ i_1 >_t i_2 >_t \ldots >_t i_n \}$.
Let $l$ be a security level. We define $O_t|l$, the restriction of $O_t$ to the set of databases $DB_i$ such that
$l = i$ or $l > i$.

Example Let us come back to the example introduced in section 3.3 and consider the security level
(Secret, Freight). For every topic $t \in$ {Dest, Freight, Crew}, we have:

$O_t|$(Secret, Freight) = {(Secret, Freight) $>_t$ Unclass}

Let us consider a person who is cleared up to level $l$ asking a query $Q$ (which is a formula of $L$).
We define the answer of $Q$ provided for persons cleared up to $l$ as follows:

Definition 8 (Answer to $Q$ provided to persons cleared up to $l$)

Let $DB_{i_1} \ldots DB_{i_n}$ be $n$ databases associated with $n$ security levels. Let $O_{t_i}$, $i = 1, \ldots, m$, be total
orders on these databases which are dependent on topic $t_i$. Let $Q$ be a formula of $L$.

$\|Q\|_l$ = TRUE iff $M \models [O_{t_1}|l \ldots O_{t_m}|l]\, Q$

$\|Q\|_l$ = FALSE iff $M \models [O_{t_1}|l \ldots O_{t_m}|l]\, \neg Q$

$\|Q\|_l$ = ? else

Example: The following array lists some questions and the answers provided according to the
clearance level of the person who asks them. Notice that we give the queries in the first order language
on which $L$ is based.

Clearance                    Where does F127 go?    What does F127 carry?    Who is the pilot of F127?
(Secret, {Dest, Freight})    Front                  Gas-Masks                Captain Brown
(Secret, Dest)               Front                  Food                     Captain Brown
(Secret, Freight)            HQ                     Gas-Masks                Captain Brown
Unclas                       HQ                     Champagne                Captain Brown
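The two ingredients of the query evaluator, the restriction of an order to a clearance level and the three-valued answer, can be sketched in Python as follows. This is only an illustration: the `DOMINATES` table, the level labels and the sample interpretations in `worlds` are assumptions introduced here, and the merged interpretations are supplied by hand rather than computed by the fusion.

```python
from typing import Callable, Dict, List, Set

# Assumed encoding: DOMINATES[l] is the set of levels readable at l (l included).
DOMINATES: Dict[str, Set[str]] = {
    "S-DestFreight": {"S-DestFreight", "S-Dest", "S-Freight", "Unclas"},
    "S-Dest": {"S-Dest", "Unclas"},
    "S-Freight": {"S-Freight", "Unclas"},
    "Unclas": {"Unclas"},
}

World = Dict[str, bool]  # an interpretation: proposition -> truth value


def restrict(order: List[str], level: str) -> List[str]:
    """O_t|l: keep, in the same relative order, the levels dominated by `level`."""
    return [i for i in order if i in DOMINATES[level]]


def answer(worlds: List[World], query: Callable[[World], bool]) -> str:
    """Three-valued answer over the interpretations of the merged database.

    `worlds` is expected to be non-empty (cf. Proposition 1).
    """
    results = [query(w) for w in worlds]
    if all(results):
        return "TRUE"      # Q holds in every merged interpretation
    if not any(results):
        return "FALSE"     # the negation of Q holds in every merged interpretation
    return "?"             # neither Q nor its negation is established


O_DEST = ["S-DestFreight", "S-Dest", "S-Freight", "Unclas"]
print(restrict(O_DEST, "S-Freight"))

# Hypothetical merged interpretations that disagree only on the pilot.
worlds = [{"front": True, "captainbrown": True},
          {"front": True, "captainbrown": False}]
print(answer(worlds, lambda w: w["front"]))
print(answer(worlds, lambda w: not w["front"]))
print(answer(worlds, lambda w: w["captainbrown"]))
```

The answer "?" arises whenever the merged interpretations disagree on the query, which is how incomplete databases show up at the query level.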

5  Comparison  with  related  work 


There exists some connection between our approach and the Nonmonotonic Typed Multilevel Logic
(NTML) developed by Thuraisingham [Thu91]. In NTML, the primitive symbols such as constant,
variable or predicate name are associated with security levels. Hence, it would be possible to
consider that the fact Cargo(F127, Gas-Masks) is secret because Gas-Masks is a secret constant.
In our approach, we consider that the logical language used to model the multilevel database is
not protected. We agree that hiding some part of this language allows us to represent additional
ways of protection. We consider that this problem represents further work that remains to be done.
However, it is not essential to achieve our goals in this paper.

On the other hand, we feel that it is a conceptual simplification to represent a multilevel database
as a set of single-level databases instead of a single multilevel theory as in NTML. We also prefer
representing each single-level database as a set of models instead of as a theory. Moreover, in
many specific situations it is not clear how to restore the database consistency using NTML. These
situations include the case of a partial ordering of the security levels but are not restricted to this
case. For instance, as noticed in [GLQS92], if $P$ and $Q$ are facts at the unclassified level and $\neg(P \wedge Q)$
is a new fact at the secret level, it is not clear how to choose which of $P$ and $Q$ is not inherited from
the unclassified level to the secret level to avoid a contradiction. In our approach, we propose to use
topics to define a finer grain of preference which allows us to restore the database consistency even
though the security levels are partially ordered. Finally, information related to the same topic is
separately merged. We argue that in this case the best attitude is a suspicious attitude instead of the
trusting attitude used in NTML. Hence, in the above example, if $P$ and $Q$ are facts at the unclassified
level and related to the same topic, $\neg(P \wedge Q)$ is a new fact at the secret level, and the secret user
adopts a suspicious attitude, then this secret user would reject both $P$ and $Q$.


6  Conclusion 

Situations exist where we need to hide the existence of an otherwise sensitive event. In these
particular situations, we generally need to use cover stories. Hence, cover stories are a fundamental
multilevel requirement. On the other hand, we agree with [Bur91] that polyinstantiation is not a
fundamental property of multilevel databases: it is simply a powerful technique for supporting cover
stories. Moreover, in using polyinstantiation, several problems have to be solved. This paper aims to
solve one of these problems, namely how to restore the database consistency when polyinstantiation
is used. We have proposed a formal mechanism which works even though the security levels are
partially ordered. Further refinements of our approach are possible. A first refinement would be to
extend our model to include the possibility to deal with rules in the database. This extension would
allow us to treat example 2 in our model.

We have also suggested that it could be interesting to hide some parts of the database schema
in order to represent additional ways of protection. Another refinement would be to include the
case where we do not need polyinstantiation, i.e. we do not want to hide the existence of a secret
event. For this purpose, we can use the special symbol restricted introduced in [SJ91]. We do not
feel that it would generate any problem because, in this case, the high view is not contradicted by
the lower view. A third possibility would be to deal with content-dependent rules. For instance, we
may introduce a rule saying that “The destination of F127 is always secret information”. All these
refinements would allow us to have a complete representation of the different ways of protection.

Our approach is designed to provide the high level user with some parts of the low level database
which are not in contradiction with the high level database. It would also be interesting to extend
this approach so that it would be possible for the high level user to see what the lower users see and
know if they are believing some cover stories. As our approach is designed to recognize if a given
piece of data is a cover story, we guess that it would probably be easy to include this extension in our model.

Another problem not discussed in this paper is how to properly choose a cover story. For instance,
in figure 3, we have rejected Dest(F127, Front) $\wedge$ Cargo(F127, Champagne) because this view would
be contradicted by an integrity constraint. Hence, a cover story, to be effective, requires consistency.
On the other hand, we have accepted the view Dest(F127, HQ) $\wedge$ Cargo(F127, Gas-Masks) for
users cleared up to (Secret, Freight) because we have considered that this view may be plausible.
Knowing if users at (Secret, Freight) would really believe in this cover story is a difficult problem
discussed in [Gb92]. We feel that combining the solution to this last problem with the mechanism
developed in this paper would be an important step toward a meaningful semantics for cover stories
and therefore polyinstantiation.
Abstract

This paper presents a new approach to the definition and the enforcement of confidentiality
in logic databases. We investigate the semantics of a logic database consequent
upon the introduction of users and rights. Regarding a database with rights as a proper
extension of an open database, we define the notion of global validity and that of a
personal database profile. We give four formal definitions of confidentiality and show that
three of them are meaningful in the presence of the Closed World Assumption. External
assumptions regarding the allocation of rights and the individual user knowledge
round off the formalism. Thereafter we focus our attention on the enforcement of the
confidentiality form C1, which allows a user to possess indefinite information on secrets.
This form, being the lowest possible level of confidentiality, has the advantage
that the database never lies to a user, ie it can be used without a cover story. We consider
the enforcement of C1 with respect to the data and the integrity constraints. The
presented formalism is theoretically sound, completely embodied in standard predicate
logic and extendible to a multilevel security model.

1  Introduction 

In this section we give an informal definition of an open logic database (LDB) and that
of a secure LDB, viz a LDB which is expected to keep some confidential information
secret from particular users. Thereafter, we discuss previous works and related
approaches. In this paper we put the emphasis on a clear illustration of our ideas, thus we
omit some formal details.

1.1 Overview

A state of the world as seen by a LDB consists of facts, rules and general laws. The
LDB model maps the facts and rules of a state of the world into a set of data and the
general laws into a set of static integrity constraints. A LDB uses normal clauses for the
uniform representation of facts, rules, integrity constraints and queries. The application-
dependent symbols which may occur in a clause are contained in the database's signature.
The database-language comprises all clauses which the database understands, viz
recognises. A user can directly obtain information from a LDB in two ways: he can
request a listing of the current state's data and integrity constraints and submit a query
which is evaluated under the Closed World Assumption. The listing contains all
clauses explicitly stored in the database; the answer to a query comprises facts either
stored or derived from the data.

The life of a LDB is determined through a series of states. The application-dependent
components which remain invariable with respect to all states, ie the signature and the
integrity constraints, define a LDB-scheme. Two states of a LDB can only differ in their
sets of data. A data set is consistent if it satisfies the integrity constraints, viz the data
seen as a set of axioms must allow the derivation of the constraints. A state of a LDB is
always valid, ie its set of data is consistent. Thus the set of all valid data sets is uniquely
determined by the integrity constraints. A transaction is an activity which modifies the
explicitly stored data of the present state. If the modified data set is valid, then the
transaction is accepted and the LDB changes its state, ie the modified data set forms
the new LDB-state. Otherwise the transaction is rejected and the LDB-state remains
unchanged, ie the modification is ignored.
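The accept-or-reject behaviour of transactions can be sketched in a few lines of Python. This is a deliberately reduced illustration and not the LDB model itself: the data consist of ground facts only (no rules, no derivation), integrity constraints are modelled as Python predicates over the data set instead of normal clauses, and the class name `LogicDatabase`, the method `transact` and the salary constraint are all hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Tuple

Fact = Tuple[str, ...]                  # a ground fact, e.g. ("salary", "ann", "5000")
DataSet = FrozenSet[Fact]
Constraint = Callable[[DataSet], bool]  # assumed stand-in for an integrity constraint


class LogicDatabase:
    def __init__(self, constraints: Iterable[Constraint], data: Iterable[Fact] = ()):
        self.constraints = list(constraints)  # invariant part of the scheme
        self.data: DataSet = frozenset(data)  # the current state's data
        if not self.is_valid(self.data):
            raise ValueError("initial data set violates the integrity constraints")

    def is_valid(self, data: DataSet) -> bool:
        """A data set is valid if it satisfies every integrity constraint."""
        return all(check(data) for check in self.constraints)

    def transact(self, add: Iterable[Fact] = (), remove: Iterable[Fact] = ()) -> bool:
        """Apply the modification only if the resulting data set is valid."""
        candidate = (self.data - frozenset(remove)) | frozenset(add)
        if self.is_valid(candidate):
            self.data = candidate   # accepted: the database changes its state
            return True
        return False                # rejected: the state remains unchanged


def one_salary_per_employee(data: DataSet) -> bool:
    """Hypothetical constraint: an employee has at most one salary."""
    seen: Dict[str, str] = {}
    for fact in data:
        if fact[0] == "salary":
            _, employee, amount = fact
            if seen.setdefault(employee, amount) != amount:
                return False
    return True


db = LogicDatabase([one_salary_per_employee], [("salary", "ann", "5000")])
print(db.transact(add=[("salary", "ann", "6000")]))     # would give ann two salaries
print(db.transact(add=[("salary", "ann", "6000")],
                  remove=[("salary", "ann", "5000")]))  # replaces the old salary
```

The first transaction is rejected and leaves the state untouched; the second one is accepted because the modified data set satisfies the constraint.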

In an open LDB each user who has access to it can obtain a full listing of the stored
data and integrity constraints, receive a complete answer to any query and change the
LDB-state into any valid state. An open database treats all users equally or, to say it in
another way, it can only discern one universal user with unrestricted power.

There are many reasons why it may be necessary to restrict the freedom of a user. One
reason is the requirement to keep some elements of a LDB secret from a user. In the
broadest sense we define a secure LDB as a LDB together with a set of users who have
access to it and a set of confidentiality requirements that state which elements of the
LDB should be kept secret from which user. The confidentiality requirements form a
part of a security policy.

1.2 Related work

According to Goguen/Meseguer (1984), a confidentiality requirement expresses that
'under certain conditions, certain individuals should not have access to certain
information'. Its formalisation as non-interference is specifically intended to model trusted
processes, but the authors also introduce a simple model of a multilevel-secure database
in proof-theoretical view which has neither integrity constraints nor updates, and
where the Closed World Assumption (CWA) is not made. In this context, they interpret
non-interference as non-derivability and say that a security violation has occurred if the
data accessible at a low security level allow the derivation of data accessible only at a
high security level. However, they do not mention when a security violation has
not occurred. Yet this distinction is important since $\alpha \in Th(I)$ and $\alpha \notin Th(I)$ are not the
only relationship-possibilities between a formula and a set of theorems.

Goguen/Meseguer (1984):75
Berson/Lunt (1987a) and Berson/Lunt (1987b) investigate the possibility of the
application of the MAC-model to deductive databases. They point out many new problems
and suggest an approach to tackle them, but, due to the initial nature of these works, no
solutions are offered.

Morgenstern (1987) notes that, in a deductive database, making a piece of information
directly inaccessible may not be sufficient if one wants to keep it secret. Here it can still
be possible to infer a secret from the accessible facts, rules and integrity constraints.

The author speaks of deductive databases in an informal manner and uses them mainly
to accentuate the new problems which arise during the transition from relational to
deductive databases.

Meadows/Jajodia (1987), Burns (1990) and Wiseman (1990) are examples of early
approaches which consider a multilevel relational database where primary key and
foreign key constraints are the only classes of integrity constraints. They assume that each
user has access to a different part of the database, but that there is just one set of
constraints which must be satisfied at all levels. Burns (1990) and Wiseman (1991) note
that there is a fundamental conflict between secrecy and integrity, since each of them
can only be enforced at the expense of the other.

Lunt/Millen (1989), Garvey/Lunt (1990) and Garvey/Lunt (1991) choose an approach
which considers deductive databases as a special case of object-oriented databases.
Although their motivation has its origins in deductive databases, the presentation is based
on the terminology of object-oriented databases. Hence it is difficult to regard this
approach as a contribution to a predicate-logic based theory of secure databases.

Steinke (1991) reports on a specific problem in multilevel secure deductive databases
which arises when integrity constraints must be kept secret. He argues that it may be
impossible to keep a database valid. However, no satisfactory solution is offered.
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(  tipi'x  n'-'  Yazdanian  (1991)  extend  a  relational  database  with  horn-clauses.  Similar  to 
Morur-nstern  (1987),  the  authors  consider  the  ‘inference  problem’,  viz  how  to  avoid  that 
,1  if^(  !  can  drav-  a  conclusion  Irom  his  accessible  data  which  should  be  kept  secret  from 
inm.  than  present  a  solution,  the}/'  emphasise  that  logic  is  a  suitable  framework 

cii  (he  study  of  security  probk'ms  in  databases. 

The first basic attempt at a formal treatment of confidentiality is presented in
Thuraisingham (1991). The author's main idea is to formalise the multilevel security
properties in NTML, a non-monotonic logic. Although this approach points in the right
direction, NTML has been shown to be not sound.

The advances achieved until 1991 are described in I'mier (1991) as simply unsatisfactory.
Security requirements have been so far neglected in the development of deductive
databases while at the same time deductive databases are increasingly employed in
sensitive areas. Consequently, he urges a stronger formal security research.

The work of Bonatti/Kraus/Subrahmanian (1992) deals with the confidentiality of
formulae in deductive databases. A formula is secret if it is not derivable. The authors
allow the database to lie; their formalism and results are based on a mixture of standard
predicate and an extended modal logic. However, the presented approach has some
weak points. On the one hand, the authors define a very simple database model which
lacks the CWA, integrity constraints and update operations. Yet, in our opinion, these
are exactly the components which make the confidentiality problem interesting. On the
other hand, they make rather unrealistic assumptions on a user's own knowledge.

These circumstances and some of their implications confine this work's applicability to
a narrow context. Finally, no motivation is provided for the choices made in this
approach, eg the unit of protection, the range of answers and the preference or necessity
of modal logic in comparison to standard predicate logic.


Garvey et al (1992): 160.

Finally, the reliability of databases is discussed in Williams (1992). His arguments are
based on the notion of external consistency.* It states that all data in a database visible
to a user must be accurate. This requirement is as such always satisfied in a normal
open database and to a large extent in a secure database allowed to reveal indefinite
information on secrets to the user. However, it precludes the use of aliases which may be
necessary in order to attain an even higher degree of confidentiality.

2  Basic  definitions 

Following Gallaire/Minker/Nicolas (1984) and Cremers/Griefahn/Hinze (1993) we
consider databases from the viewpoint of predicate logic. Thus the discussion and the
results are also valid for relational databases in proof-theoretical representation.

2.1  Predicate  logic 

Definition 1 A signature $\Sigma$ is a pair $\Sigma = (FS, PS)$. The set FS contains ranked function
symbols and PS ranked predicate symbols. Both sets, FS and PS, are non-empty, finite
and disjoint.

Definition 2 The set of terms over the signature $\Sigma$, $TE^{\Sigma}$, is the smallest set with the
following properties: each variable is a term; each constant, ie a function symbol of rank 0,
is a term; let $f$ be a function symbol of rank $k$ and $t_1, \ldots, t_k$ terms, then $f(t_1, \ldots, t_k)$ is a
term. A term is ground if it does not contain any variable.

Definition 3 Let $r$ be a predicate symbol of rank $k$ and $t_1, \ldots, t_k$ terms, then $r(t_1, \ldots, t_k)$ is
an atomic formula, or simply an atom. An atom is ground if it comprises only ground
terms. Let $\alpha$ be an atomic formula, then $\alpha$ is also a positive literal and $\neg\alpha$ a negative
literal. We denote the set of atomic formulae over $\Sigma$ by $AF^{\Sigma}$ and the set of literals over $\Sigma$

Williams (1992): 58.

Definition 4 A clause is a formula of the form $\alpha_1 \vee \ldots \vee \alpha_m \leftarrow \beta_1 \wedge \ldots \wedge \beta_n$, in which all
variables are assumed to be universally quantified. Each $\alpha_i$ in the head of the clause is
an atom and each $\beta_j$ in its body a literal. A clause is ground if it comprises only ground
atoms. A clause is normal if $m = 1$; it is a query if $m = 0$. A normal clause is called a rule
if $n \geq 1$; it is called a fact if $n = 0$. A clause is range-restricted if each of its variables
occurs also in a positive literal in its body. We denote the set of all range-restricted clauses
over $\Sigma$ by $CL^{\Sigma}$ and its subset of normal clauses by $NCL^{\Sigma}$.*

We assume in this paper that all formulae are range-restricted clauses.
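The clause classes of Definition 4 can be sketched in code. The following is a minimal illustration, not part of the paper's formalism: it assumes function-free terms (constants and variables only), and the names `Atom`, `Literal`, `Clause` and the convention that variables start with an uppercase letter are hypothetical choices made for this sketch.

```python
from dataclasses import dataclass


def is_var(term: str) -> bool:
    # Assumed convention for this sketch: variables start with an uppercase letter.
    return term[:1].isupper()


@dataclass(frozen=True)
class Atom:
    pred: str
    args: tuple = ()

    def variables(self) -> set:
        return {t for t in self.args if is_var(t)}


@dataclass(frozen=True)
class Literal:
    atom: Atom
    positive: bool = True


@dataclass(frozen=True)
class Clause:
    head: tuple = ()  # atoms alpha_1, ..., alpha_m
    body: tuple = ()  # literals beta_1, ..., beta_n

    def is_ground(self) -> bool:
        atoms = list(self.head) + [lit.atom for lit in self.body]
        return not any(a.variables() for a in atoms)

    def is_normal(self) -> bool:
        return len(self.head) == 1

    def is_query(self) -> bool:
        return len(self.head) == 0

    def is_rule(self) -> bool:
        return self.is_normal() and len(self.body) >= 1

    def is_fact(self) -> bool:
        return self.is_normal() and len(self.body) == 0

    def is_range_restricted(self) -> bool:
        # Every variable of the clause must occur in a positive body literal.
        all_vars = set()
        for a in self.head:
            all_vars |= a.variables()
        for lit in self.body:
            all_vars |= lit.atom.variables()
        positive_body_vars = set()
        for lit in self.body:
            if lit.positive:
                positive_body_vars |= lit.atom.variables()
        return all_vars <= positive_body_vars
```

Under this reading a range-restricted fact is necessarily ground, since its body is empty and therefore offers no positive literal to cover a variable.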

Definition 5 Let $X \subseteq CL$ be a set of clauses, then $Th(X) \subseteq CL$ denotes all clauses
which can be (logically) derived from $X$ (for a clause $\varphi$, $\varphi \in Th(X)$ is also denoted as
$X \vdash \varphi$). The set of all literals in $Th(X)$ is denoted by $F(X)$, ie $F(X) = \{\lambda \in Th(X) \mid \lambda \text{ is a literal}\}$.

2.2  Logic  databases 

Definition 6 A LDB-scheme DB is $DB = (\Sigma, C)$, where $\Sigma = (FS, PS)$ is a signature and
$C \subseteq CL^{\Sigma}$ a set of static integrity constraints; the present state of DB is denoted as
$db = I$, where $I \subseteq NCL^{\Sigma}$. The closure of $I$ under the Closed World Assumption is
denoted as $\bar{I}$, ie $F(\bar{I}) = F(I) \cup \{\neg\alpha \mid \alpha \in AF^{\Sigma} \setminus F(I),\ \alpha \text{ ground}\}$. A state $db = I$ is always
consistent, viz $C \subseteq Th(I)$ holds.
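The closure under the Closed World Assumption can be illustrated for the simplest case. The sketch below assumes a function-free signature, so that the set of ground atoms is finite, and it assumes that the positive ground facts $F(I)$ have already been computed; the function name `cwa_closure` and the tuple encoding of atoms are hypothetical.

```python
from itertools import product


def cwa_closure(positive_facts, predicates, constants):
    """Return the literals of the CWA closure as (sign, atom) pairs.

    positive_facts: set of ground atoms (pred, args), standing in for F(I)
    predicates:     dict mapping predicate name -> rank
    constants:      set of constant symbols (no proper function symbols assumed)
    """
    ground_atoms = {
        (pred, args)
        for pred, rank in predicates.items()
        for args in product(sorted(constants), repeat=rank)
    }
    # Every ground atom that is not derivable is assumed to be false.
    negatives = ground_atoms - set(positive_facts)
    return {(True, a) for a in positive_facts} | {(False, a) for a in negatives}
```

With the predicates p, q, s of rank 1, the single constant a and the positive facts p(a) and s(a), the closure additionally contains the negative literal for q(a).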

Definition 7 Let $\mathcal{I}(C)$ denote the set of all consistent data sets with regard to $C$, ie
$\mathcal{I}(C) = \{I \subseteq NCL \mid C \subseteq Th(I)\}$, and let $db = I$ be the present state of DB. From a
declarative viewpoint, a transaction $\tau$ is a set $\Delta \subseteq NCL$. From an operational viewpoint, a
transaction $\tau = (\Delta, I)$ alters the set $I$ into $X \subseteq NCL$, which we denote as $I \xrightarrow{\tau} X$;
$\tau = (\Delta, I)$ is completely characterised through two components: the set of facts deleted
from $I$, $\delta$, and the set of facts newly included into $X$, $\iota$, ie $\delta \cap \iota = \emptyset$ and
$F(X) = (F(I) \setminus \delta) \cup \iota$. If $X \in \mathcal{I}(C)$, then $\tau$ is accepted and $db = X$ is the new state of DB.

The set of all accepted transactions with regard to $\mathcal{I}(C)$ is denoted by $T(C)$, ie
$T(C) = \{\tau(\delta, \iota) \mid I \xrightarrow{\tau} X,\ X \in \mathcal{I}(C)\}$.

We omit the superscript whenever the respective signature is evident.

Moreover, we assume that the only way to communicate with a LDB is through an
interface with the following properties:

• LDB's response to the command LIST is a complete listing of $\Sigma$, $C$ and $I$.

• LDB's response to a query is:

    Syntax error if the query is not a valid query in the language over $\Sigma$,

    otherwise, a possibly empty set of ground substitutions which define a subset
    of $F(I)$.

• LDB's response to a transaction $\tau$ is:

    Syntax error if $\tau$ contains a clause which is not in $NCL^{\Sigma}$,

    Accepted if $\tau \in T(C)$,

    Rejected if $\tau \notin T(C)$.

• LDB's response to any other input is Unrecognised command.
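The interface listed above can be sketched as a small class. This is an illustration under strong simplifying assumptions that are not taken from the paper: the state contains ground facts only (no rules), queries are single positive atoms, and each integrity constraint is represented by a hypothetical pair of a display text and a Python check function over the set of facts. The class name `LDB` and its method names are likewise hypothetical.

```python
def is_var(term):
    # Assumed convention for this sketch: variables start with an uppercase letter.
    return term[:1].isupper()


class LDB:
    def __init__(self, predicates, constants, constraints, facts):
        self.predicates = dict(predicates)    # PS: predicate name -> rank
        self.constants = set(constants)       # FS, restricted to constants
        self.constraints = list(constraints)  # C: (text, check) pairs
        self.facts = set(facts)               # I: ground facts (pred, args)

    def _well_formed(self, atom, allow_vars):
        pred, args = atom
        if self.predicates.get(pred) != len(args):
            return False
        return all(a in self.constants or (allow_vars and is_var(a)) for a in args)

    def _valid(self, facts):
        return all(check(facts) for _, check in self.constraints)

    def list_all(self):
        # Response to LIST: signature, constraints and present state.
        return {
            "signature": (self.constants, self.predicates),
            "constraints": [text for text, _ in self.constraints],
            "state": self.facts,
        }

    def query(self, atom):
        if not self._well_formed(atom, allow_vars=True):
            return "Syntax error"
        pred, args = atom
        answers = []
        for f_pred, f_args in sorted(self.facts):
            if f_pred != pred:
                continue
            theta = {}
            for a, c in zip(args, f_args):
                if is_var(a):
                    if theta.setdefault(a, c) != c:
                        break  # variable already bound to another constant
                elif a != c:
                    break      # constant mismatch
            else:
                answers.append(theta)
        return answers

    def transaction(self, delete, insert):
        if not all(self._well_formed(a, allow_vars=False) for a in delete | insert):
            return "Syntax error"
        new_state = (self.facts - delete) | insert
        if self._valid(new_state):
            self.facts = new_state
            return "Accepted"
        return "Rejected"
```

A rejected transaction leaves the state unchanged, which mirrors the requirement that a state is always consistent.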

2.3 Persons and rights

The database presented above is an open one because it cannot tell one user from
another: it answers any query and follows any valid transaction in the same manner. A
database must be able to recognise the users if it is expected to treat them differently.
Therefore we add to our database a set $P$ of all users or persons who have access to it.

The rules of the database behaviour towards a user are usually laid down in relations.
Relations expressing rights and prohibitions are of a special interest to us. We
introduce for each user $p \in P$ the following rights:

• $RS_p \subseteq CL^{\Sigma_p}$ determines the clauses a person may see as an element of $I$ or $C$.

• $RD_p \subseteq RS_p$ determines the clauses a person is allowed to delete.

• $RI_p \subseteq CL^{\Sigma_p}$ determines the clauses a person is allowed to insert.

Now we have arrived at a database which recognises different users and is able to
behave in accordance with the stated rights. We call it a database with rights.

2.4  Forms  of  secrecy 

A secure logic database is a logic database with rights together with a set of
confidentiality requirements. Speaking in a colloquial manner, a confidentiality requirement is a
statement of the form:

d should be kept secret from p with regard to DB

where d is a component of LDB and p a user from $P$. A user knows that DB is a LDB.
Thus only the elements which form a particular LDB-scheme or a LDB-state can
possibly be kept secret: the symbols of the signature, the terms, the atoms, the clauses, the
integrity constraints and the data.

2.4.1 Confidentiality of atomic formulae

Let us consider the confidentiality requirement:

$\alpha$ should be kept secret from $p \in P$ with regard to DB

Spalka (1994a) has shown that the confidentiality of an atomic formula can be related to
its derivability, ie to the membership problem with respect to $Th(I)$. The following table
illustrates the different possibilities of a relationship between an atomic formula and a
set of formulae.

For the moment it suffices to know that the sets of symbols of $\Sigma_p$, the signature of p, are subsets of
the respective sets of $\Sigma$. The motivation for the removal of a symbol from $\Sigma_p$ is given later.

DoC*   Answer             Amount of information on derivability
                          Informal            Formal

(G0)   Yes                Positive definite   $\alpha \in Th(I)$
G1     Maybe              Indefinite          $\alpha \vee \alpha' \vee \alpha'' \vee \ldots \in Th(I)$
G2     No                 Negative definite   $\alpha \notin Th(I)$
G3     Don't know         Indeterminate       $\alpha \notin Th(I)$ and $\neg\alpha \notin Th(I)$
G4     Don't understand   No information      $\alpha \notin AF$

The entries in the table are interpreted as the database's answer to the user's question
'Does $\alpha \in Th(I)$ hold?' in the situation when $\alpha \in Th(I)$ actually holds. There are five
possible answers. The first answer obviously does not preserve the confidentiality of $\alpha$,
but which of the remaining four does? We see that it is not sufficient to merely require
that $\alpha$ should be kept secret from p; in addition we must specify how much information
on $\alpha$ with regard to $I$ p is allowed to have.

At G4 the database pretends that $\alpha$ cannot be constructed in its language. It gives the
user no information on the relationship between $\alpha$ and $I$. At G3 the database
understands the user's question, but it pretends that it has insufficient information to answer
it. At G2 the database's answer is a definite 'No'. Finally, at G1 the database tells the
user that it knows the answer but it will not give it to him. From the viewpoint of secrecy,
G0 relates to facts which are not secret, G1 is the weakest and G4 the most stringent
form of confidentiality.

From now on a confidentiality requirement for an atomic formula must be accompanied
by a degree of confidentiality, ie it is of the form:

$\alpha$ should be kept secret from p with regard to $I$ at the degree of confidentiality G
where G can be G1, G2, G3 or G4.
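The degrees of confidentiality and the answers associated with them in the table can be written down as a small lookup. The sketch below is only an illustration; the names `Degree`, `ANSWER` and `answer` are hypothetical, and the function covers only the situation discussed in the text, namely a question about an atom that is in fact derivable.

```python
from enum import IntEnum


class Degree(IntEnum):
    G0 = 0  # not secret
    G1 = 1  # weakest form of confidentiality
    G2 = 2
    G3 = 3
    G4 = 4  # most stringent form of confidentiality


ANSWER = {
    Degree.G0: "Yes",
    Degree.G1: "Maybe",
    Degree.G2: "No",
    Degree.G3: "Don't know",
    Degree.G4: "Don't understand",
}


def answer(atom, requirements):
    """Answer to 'Does atom hold?' for an atom that is actually derivable.

    requirements maps secret atoms to their degree; atoms without an entry
    are treated as not secret (G0).
    """
    return ANSWER[requirements.get(atom, Degree.G0)]
```

Using an ordered enumeration reflects the remark that G1 is the weakest and G4 the most stringent degree.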


* Degree of Confidentiality


Spalka, 04.07.1994
The interpretation of G4 in a LDB is easy since all we have to do is to remove one
symbol from a user's signature he would need to construct the secret fact. However, it
should be kept in mind that G4-requirements may unreasonably deteriorate the
usefulness of the database, since the removal of one symbol from a user's signature reduces
his language by a whole group of clauses.

G3 is trivially not satisfiable in a LDB with the CWA. The CWA tells us that for each
atom $\alpha$, either $\alpha \in F(I)$ or $\neg\alpha \in F(I)$ holds.

The interpretation of G2 is again simple. Here the CWA tells us that
$\alpha \notin F(I) \Rightarrow \neg\alpha \in F(I)$, ie the non-derivability of the secret fact implies by default the
derivability of its negation.

In order to illustrate the interpretation of G1, let us consider the following example.

Example 1 Let $\Sigma = (FS = \{a\}, PS = \{p, q, s\})$ and $C = \{p(X) \vee q(X) \leftarrow s(X)\}$ be a
LDB-scheme visible to a user u. Let moreover $F(I) = \{p(a), s(a)\}$, and $p(a)$ should be kept
secret from u with regard to $I$ at the degree of G1. Then $p(a)$ must not be derivable for
u. Thus we reduce u's set of positive data to $F(I_u) = \{s(a)\}$. Now the trouble is that $I_u$
does not satisfy $C$, and we owe the user an explanation. We suggest to tell him that:

• the integrity constraints in $C$ are always satisfied by the data in $I$

• his data may seem to violate $C$ due to some secrets

Now the user is able to identify the violated constraint, and through a simple
substitution he can find out that $p(a) \vee q(a) \in Th(I)$ holds, viz either $p(a)$ or $q(a)$ is true. It may
be $p(a)$ but it may as well not be $p(a)$.
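The reasoning of the user in Example 1 can be replayed mechanically. The following sketch assumes a single constraint of the shape $h_1(X) \vee \ldots \vee h_m(X) \leftarrow b(X)$ over unary predicates and a set of visible ground facts; the function name `indefinite_information` and the encoding of facts as (predicate, constant) pairs are hypothetical.

```python
def indefinite_information(visible_facts, head_preds, body_pred, constants):
    """Return the disjunctions a user can infer from an apparently violated constraint.

    For every constant c such that body_pred(c) is visible but no head atom
    h(c) is visible, the user learns that h_1(c) v ... v h_m(c) holds in the
    original database, because the constraints are always satisfied there.
    """
    inferred = []
    for c in sorted(constants):
        body_holds = (body_pred, c) in visible_facts
        head_holds = any((h, c) in visible_facts for h in head_preds)
        if body_holds and not head_holds:
            inferred.append([(h, c) for h in head_preds])
    return inferred
```

For the visible data $\{s(a)\}$ and the constraint $p(X) \vee q(X) \leftarrow s(X)$ the function returns the single disjunction consisting of p(a) and q(a), which is exactly the indefinite information the G1-degree allows the user to have.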

We see that the interpretation of G1 in a LDB involves some interactions and new
conventions. One can object that it is hardly acceptable if confidentiality depends only on
two choices, eg $p(a)$ or $q(a)$. However, the integrity constraints as such are given by an
application and G1 only interprets them. Thus if the relevant integrity constraint is a
definite clause, then it may be impossible to keep a fact secret; if on the other hand the
constraint is a disjunction of 7000 links, then G1 seems to be fairly strong.

Although G1 is the weakest form of confidentiality, it has a number of important
advantages over the other forms. Firstly, it does not lie to the user. It only provides him with
weaker information than it is capable of, but this information is still true. If we
contemplate the possible consequences of a lie from a practical and ethical point of view, then it
seems preferable to give imprecise rather than false information. This viewpoint is
shared by many researchers. Secondly, the enforcement of G1 can be achieved
without cover stories, viz explicitly introduced lies. Compared to it, G2 may require
the maintenance of a consistent set of lies which greatly raises the effort needed to
enforce it.

2.4.2 Confidentiality of clauses

Most previous approaches have defined the confidentiality of a clause as
non-derivability, ie a clause $\varphi$ is secret for a user p with regard to a set of clauses $I$, if p cannot
conclude that $\varphi \in Th(I)$ holds. We believe that this definition is inappropriate. Let us
consider two motivating examples.

Example 2 Let $db = I = \{r(a)\}$, then the clause $\varphi = r(X) \leftarrow q(X)$ is derivable from $I$.
Yet there is no substitution which makes $\varphi$'s body true; its derivability depends
completely on the default data given by the CWA. In particular, $\varphi$ remains derivable even if
we substitute any other predicate symbol of the signature for q.

Example 3 Let $F(I) = \{r(a_1), \ldots, r(a_{1000}), q(a_1), \ldots, q(a_{1000})\}$, then $\varphi = r(X) \leftarrow q(X)$ is
derivable. In order to make $\varphi$ non-derivable, it is already sufficient to adjust the data in $I$ so
that eg either $r(a_1)$ is deleted from $F(I)$, or $q(a_{1001})$ is inserted into $F(I)$.


CSFW (1994).

In the first example, the clause is trivially derivable, but this information seems to be of
little avail. In the second example, the clause is no longer derivable as soon as there is
just one substitution which makes the clause's body true while leaving its head false,
but shall we just neglect the 1000 other substitutions where this is not the case? When
we say that a clause is confidential, do we just regard the clause as any other formula,
or do we implicitly express that a confidential rule derives confidential data? There are
presumably no formal grounds on which this question can be answered. However,
due to our motivation, we believe that the second part of this question should be
answered with 'Yes'.

We therefore introduce the following definition. Let $\varphi$ be $\varphi = \alpha_1 \vee \ldots \vee \alpha_m \leftarrow \beta_1 \wedge \ldots \wedge \beta_n$. The
confidentiality requirement

$\varphi$ should be kept secret from the user u with regard to $I$ at the degree of G

is interpreted in a LDB as follows:

i) $\varphi \notin I_u$ and $\varphi \notin C_u$

ii) For all ground substitutions $\pi$ of $\beta_1 \wedge \ldots \wedge \beta_n$ so that $\pi(\beta_1 \wedge \ldots \wedge \beta_n) \in Th(I)$, it is
required that $\pi(\alpha_1 \vee \ldots \vee \alpha_m)$ should be kept secret from u with regard to $I$ at the
degree G.
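Point ii) of this interpretation can be sketched as an enumeration of ground substitutions. The code below is a minimal illustration under assumptions not made in the paper: the signature is function-free, the body consists of positive atoms only, and $F(I)$ is given as a set of ground atoms. The names `substitute` and `induced_secrets` are hypothetical.

```python
from itertools import product


def is_var(term):
    # Assumed convention for this sketch: variables start with an uppercase letter.
    return term[:1].isupper()


def substitute(atom, theta):
    pred, args = atom
    return (pred, tuple(theta.get(a, a) for a in args))


def induced_secrets(head, body, facts, constants):
    """Ground head instances that become secret when the clause head <- body is secret.

    head:  list of atoms (pred, args), read as a disjunction
    body:  list of positive atoms (pred, args), read as a conjunction
    facts: set of ground atoms, standing in for F(I)
    """
    # Range-restriction guarantees that every head variable occurs in the body.
    body_vars = sorted({a for _, args in body for a in args if is_var(a)})
    secrets = []
    for values in product(sorted(constants), repeat=len(body_vars)):
        theta = dict(zip(body_vars, values))
        if all(substitute(b, theta) in facts for b in body):
            secrets.append(tuple(substitute(h, theta) for h in head))
    return secrets
```

For a clause with an empty body the enumeration yields the head itself, so the sketch degenerates to the confidentiality of a single ground atom.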


Analogous to the previous section, G4-secrecy of a clause is linked to the user's
signature, and a clause cannot be kept secret at the degree G3 due to the CWA. The
G2-degree means that $\pi(\alpha_1 \vee \ldots \vee \alpha_m) \notin Th(I)$. Since $\pi(\alpha_1 \vee \ldots \vee \alpha_m)$ is a disjunction of ground
atoms this can only be satisfied when each link of the disjunction is not derivable. The
G1-degree does not seem to make sense for indefinite clauses. However, G1 has a
simple interpretation for definite clauses, eg $\varphi = \alpha \leftarrow \beta_1 \wedge \ldots \wedge \beta_n$:

• $\varphi \notin I_u$ and $\varphi \notin C_u$,

• all ground instances of the head derivable from the present state's data, ie
$F(I)$, should be kept secret at the degree G1.

Finally, we note that the given interpretation of a clause's confidentiality is a proper
extension of an atomic formula's confidentiality. If the clause is definite and has an empty
body, then its head is a ground atom.

2.4.3 Some remarks

A complete discussion of the confidentiality of symbols of the signature and of terms
can be found in Spalka (1994a).

In our opinion, one of the main advantages of the presented formalisation is that it is no
longer necessary to use ambiguous, colloquial descriptions of confidentiality-degrees,
eg fully secret, disclosure of the existence, partial disclosure of secret information. To
give an example, when is an atomic formula $r(t_1, \ldots, t_k)$ partially disclosed to a user?

• When the user knows that the symbol r or some terms are elements of the
database-language?

• When the user knows that $r(t_1, \ldots, t_k)$ is a valid formula, viz the database
recognises it?

• When the user knows that some atomic formula comprising $t_i$ is derivable?

The degrees G1-G4 avoid this confusion.

This paper investigates only the enforcement of G1. It is the weakest form of
confidentiality and we do not say that it is suitable for all situations. Yet we feel that it may be
appropriate for some situations and we know that its investigation is a necessary
preliminary step for the investigation of G2, which seems to be the commonly preferred
form of confidentiality.

2.5 Personal database profiles

Let DB be a database with the scheme $DB = (\Sigma, C)$ and the state $db = I$. The application
of RS, the right to see, to DB provides for each user p his profile $DB_p = (\Sigma_p, C_p)$ and
$db_p = I_p$. We add the requirement to the profile that it satisfies the confidentiality
requirements with regard to the user p, but there is more than this. Our starting point has been an
open database. Then we have added users and rights. If a user possesses all rights,
then his profile is identical to the whole database. Otherwise, his profile is different.
Now, should the database semantics of the open database or of a profile be allowed
to vary depending on the actual settings of the rights? We maintain that the desirable
answer is in both cases 'No'. We would like to look on a profile as an independent open
database which respects the validity of the original database. We state four expectations
about the semantics of a database with rights:

(i) The original database $DB = (\Sigma, C)$ and $db = I$ is valid if $C \subseteq Th(I)$.

(ii) A profile $DB_p = (\Sigma_p, C_p)$ and $db_p = I_p$ is valid if $C_p \subseteq Th(I_p)$.

(iii) The validity of a profile is subordinate to the validity of the original database:

$C_p \subseteq Th(I_p) \Rightarrow C \subseteq Th(I)$

(iv) The validity of two different profiles is independent from each other.

Points (i) and (ii) restate the traditional definition of integrity in databases. To
question any one of them means to question integrity constraints as such, viz we would no
longer talk about databases. Point (iii) allows for the rejection of an update command
by p even if the resulting state $db = I$ is valid. This can be the case when $C_p$
contains stronger properties than $C$. This is consistent with our formalism and can be
useful in practice. These considerations show that it no longer makes sense to ask if a
database with rights is valid when we have the definition of validity of an open database
in mind. We therefore give a new definition of the validity of a database with rights DB.

Point (iii) expresses also that the integrity of non-confidential data implies the integrity of confidential
data. This may sound strange at first. However, the whole database comprises all data and its integrity
may never be violated - this is a fundamental property of a database.

We still say that DB is valid, if the state $db = I$ is valid. But we say that DB is locally
valid for a $p \in P$ if $DB_p$ is valid, or simply that DB is locally valid if it holds for all
profiles, and we say that DB is globally valid if $db = I$ is valid and DB is locally valid, ie all
profiles are also valid.
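The three notions of validity can be stated as simple checks. The sketch below assumes, purely for illustration, that a state is a set of ground facts and that each integrity constraint is a Python function that returns True when the constraint is satisfied by a set of facts; this stands in for the condition $C \subseteq Th(I)$. The function names and the dictionary of profiles are hypothetical.

```python
def is_valid(constraints, facts):
    # Stand-in for C being a subset of Th(I): every constraint check must succeed.
    return all(check(facts) for check in constraints)


def is_locally_valid(profiles):
    # profiles maps each user p to a pair (C_p, I_p).
    return all(is_valid(c_p, i_p) for c_p, i_p in profiles.values())


def is_globally_valid(constraints, facts, profiles):
    # Globally valid: the original state is valid and every profile is valid.
    return is_valid(constraints, facts) and is_locally_valid(profiles)
```

Applied to Example 1, the original state with p(a) and s(a) is valid, whereas a profile that shows only s(a) together with the unchanged constraint is not; this is the situation in which the user is owed an explanation.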

In this light the notion of global validity of a database with rights seems to be the
matching counterpart to the notion of validity of an open database.

Point (iii) requires some final explanation. Animated by other authors, we have also
contemplated the situation where a command is rejected because $C \not\subseteq Th(I)$, even if
$C_p \subseteq Th(I_p)$ would hold. We exclude it for the following reasons:

• The user p has been granted his rights on condition that he is trusted to make
use of them. To us it seems judicious to provide him with an explanation for the
acceptance as well as for a rejection of his actions authorised by these rights.

• We have considerable doubt whether it makes sense at all to talk of a database
from the user's point of view when the part of the database seen by him exhibits
random behaviour. We could then omit $C_p$ completely from his profile, since he
would never know if a decision made by $C_p$ is not overruled by some invisible
authority.

• Finally, the formalism the database is based on would be of no use for the
determination of the risks of disclosure. An autonomous profile gives the user no
opportunity of finding out any properties of $C$. In the other case the database
would not have the slightest idea of the information which the user already has
deduced and will deduce from its behaviour.


2.6 External assumptions

Although we have taken a theoretical approach to security, it would be inappropriate to
neglect the real context of an environment where a database may be used. There are
three main areas of concern outside the database:

• The information stored in the database a user has already known of.

• The process of rights-granting.

• Software and hardware manipulations.

2.6.1 Individual knowledge

When we talk about a database with users who are real persons, we must take their
individual knowledge relevant to the database contents into account. The necessity to
model the user knowledge has been recently strongly emphasised by Landwehr and
LaPadula. A person's own knowledge is evidently not subject to the allocation of rights
in the database. Thus it can be considered as the lower bound on the size of a person's
database profile.

We denote the properties of $I$ known to p by $E_p$; we call $E_p$ the expectations of p with
regard to the contents of $I$. In order that the database is able to take $E_p$ into
consideration, its contents must lie within the expressive power of the database, hence we
assume that $E_p \subseteq CL$. Moreover we assume that p's perception conforms to the notion of
validity as it is understood by the database, ie $\forall p \in P: C \subseteq Th(I) \Rightarrow E_p \subseteq Th(I)$. Note that
$E_p$ is also sufficient to express p's knowledge of any piece of information from $I$. In this
paper we assume that an $E_p$ is given for each p.

,  ■■p'V  ,l(ll).p 

In this way we can also introduce lower bounds on the size of the signature and the set of clauses.
This, however, is not relevant to this paper.

We are well aware of the difficulty in determining the expectations of a user in practice and it is
obviously out of question that safe clauses can only capture a part of the relevant user knowledge, but still we




2.6.2 Trustworthiness

Our database recognises rights, but the decision to grant or refuse a right is made
outside it. Who makes the decision and what are his criteria? The first part of the question
is for sure of great practical importance, but it has no relevance whatever to the
problem of confidentiality when according to the criteria only reasonable decisions are
approved of. The motivation for the criteria is given by Landwehr (1981):

When a document is not in a safe, it is in the custody of some individual trusted not
to distribute it improperly.

This means that a person is only granted a right if he can be trusted to comply with its
intended use. In particular, we may assume that the refusal of a right to a person within
the database will not be circumvented outside the database by other persons
possessing this right. When the right to see is considered, a specific confidentiality is usually
assigned to the document, ie the object of protection, and an individual trustworthiness
to each person. The condition that the person's trustworthiness is adequate to the
object's confidentiality by some kind of measurement, is necessary but not sufficient for
the granting of this right.

Our database with rights makes therefore the following assumptions:

• The granting of a right to a person is always justified by this person's
trustworthiness which is established outside the database.

• A person always behaves in accordance with the expectations implied by his
trustworthiness.


believe that it is a good point to start with.
Landwehr (1981): 2.t0.

One should note that these assumptions do not imply that a person is trusted to see confidential data. If
this person is able to trick another person to obtain confidential information, then there is nothing the
database can do about it.


2.6.3 Others

The security provided by a database is just one component of overall security.
We do not consider any software attacks like Trojan Horse Programs, etc, nor any
hardware attacks, like theft, wire tapping, etc. These issues are related to a particular
implementation but not to our logical model. We take only those risks into account
which can be expressed within our formalism, viz which the database can recognise
independent of its actual implementation. We make the following assumptions:

• The only way for a person to get a piece of information stored in the database is
through the database interface intended for communication with this person.

• Any piece of information a person may otherwise get is in tune with the
database's present assignment of rights.

3  Enforcement of G1-secrecy

This paper concentrates on the enforcement of G1 only in logic databases. In this
section we investigate the aspects of the various forms of reasoning about the interactions
between a user and a database with regard to G1-secrecy. The primary purpose of this
section is to enable the database to recognise when a secret fact may possibly be
disclosed to a user. For this reason, our investigations do not extend beyond the
expressive power of our database.

Let Gp denote the set of facts which should be kept secret from the user p with regard
to RSp at the G1-degree.

3.1  G1 and the right to see

We commence by considering the case in which the user p is only an observer of his
profile, that is, his rights RD and RI are empty. In his profile DBp, we set
Cp = Kp because Cp is not involved in the process of answering p's queries, and p is not
allowed to issue any update commands.

First of all, as a formalisation of the external assumptions, we may assume that no
secret fact is already known to p. Secondly, we also assume that no secret fact is among
F(Ip). Otherwise the database will tell p a secret whenever he likes to ask about it.
Finally, the user's state dbp = Ip should be productively useful, ie it should contain exactly
the information which the user needs to do his job. Now we face the problem that such
a state may be invalid. We have to consider that Kp places a lower limit on the size of
dbp = Ip, and we may not adjust Kp's contents. Thus an Ip which does not tell p any
secret may well be too small for Kp. There are three possibilities to handle this situation:

i)  We can try to enlarge dbp = Ip to a valid set by including some clauses of db = I
which need not be kept secret from p and which guarantee that F(Ip) still does
not comprise any secret.

ii)  We can enlarge dbp = Ip into a valid set if we add all the clauses primarily
responsible for its invalidity to it.

iii)  We do not adjust dbp = Ip. Instead we tell p that his current state is invalid due to
some secrets.

The first possibility is obviously the most desirable one, but its application depends on
the contents of the present state db = I. If there are no suitable clauses or we are not
willing to show p clauses of which he has no need to know, then point (ii) suggests
relinquishing some secrecy - this is rather a surrender than a solution. We believe that
whenever point (i) cannot be applied, we should follow point (iii) and see whether this
preserves G1-secrecy. To admit point (iii), we say that a profile DBp is weakly
consistent if there is a subset S ⊆ Gp so that Cp ⊆ Th(Ip ∪ S). Similarly, we say that a
database with rights is globally weakly consistent if there is a weakly consistent profile. One
should note that we never allow DB to become inconsistent. Inconsistency is only
tolerated as a local phenomenon of a profile due to the confidentiality requirements.

If this is not the case, then some external security precautions have failed to keep it back from him.

Sometimes we call a valid state strictly consistent in order to draw a distinction to weak consistency.

If DBp is valid, then - from the database's point of view - secrecy is preserved. How
can p find out if his profile is only weakly consistent? He can identify the violated
integrity constraints of Cp and then look for the reason of the violation. Let us assume that
γ = α1 ∨ ... ∨ αm ← β1 ∧ ... ∧ βn ∈ Cp is a weakly consistent constraint, ie it is false due to a
secret fact. It can only be violated if all βi are true and all αi are false, viz there is a
substitution π so that π(β1 ∧ ... ∧ βn) ∈ Th(Ip) and π(α1 ∨ ... ∨ αm) ∉ Th(Ip). However, weak
consistency tells the user that π(α1 ∨ ... ∨ αm) ∈ Th(I). Firstly, we see that weak consistency
is concordant with the definition of G1. Secondly, we see that a definite integrity
constraint is not able to preserve secrets.

When the user is left with π(α1 ∨ ... ∨ αm), m > 1, as the result of his search, then each
π(αi) represents an equally well suited candidate for a secret. What else can he do to
reduce the number of candidates? Principally, he could check his rights RS and find out
that there is just one candidate which would not be visible to him. To close this gap, we
assume that a user's rights manifest themselves only through the interaction with his
profile. The only remaining way (within the database) for the user is to simulate insert
commands for each candidate. To keep the secret, there must be at least one more
candidate the insertion of which would lead to a consistent profile. This method can
preserve the secret fact which is responsible for the weak consistency of the constraint,
yet it may disclose another secret.


Note that if there is more than one substitution like π, then each one identifies a secret fact since
otherwise DBp would not be weakly consistent.

Since p has no insert-right, he is forced to perform these what-if operations outside the real database,
but they are still within the expressive power of our formalism.

cf Example 1, p 11.

Example 4  Cp = {r(X) ∨ s(X) ← p(X)}, Ip = {p(a), v(X) ← r(X), v(X) ← s(X)}, and
Gp = {s(a), v(a)}. This profile is weakly consistent and preserves G1-secrecy of s(a), yet
the secret fact v(a) is disclosed.

Definition 5  Let γ = α1 ∨ ... ∨ αm ← β1 ∧ ... ∧ βn be a weakly consistent constraint. π(αi) is
a candidate for a secret if Ip ∪ {π(αi)} is consistent. γ is G1-secrecy-preserving if there
are at least two candidates for a secret and there is no secret fact δ such that
δ ∈ F(Ip ∪ {π(αi)}) holds for all candidates π(αi).
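
For a ground (variable-free) instance, the test described in this definition can be sketched in Python. This is only an illustration under assumptions that are not part of the paper: clauses are ground definite rules, F(I) is computed by naive forward chaining, "consistent" is read as "violates no ground constraint", and every function name below is hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Rule = Tuple[str, FrozenSet[str]]                    # (head, body); a fact has an empty body
Constraint = Tuple[FrozenSet[str], FrozenSet[str]]   # (head disjuncts, body conjuncts)


def facts(rules: List[Rule]) -> Set[str]:
    """F(I): all ground atoms derivable from the rules by forward chaining."""
    derived: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in derived and body <= derived:
                derived.add(head)
                changed = True
    return derived


def violated(constraint: Constraint, derived: Set[str]) -> bool:
    """A constraint is violated if its body holds but no head disjunct does."""
    heads, body = constraint
    return body <= derived and not (heads & derived)


def candidates(rules: List[Rule], constraints: List[Constraint],
               gamma: Constraint) -> List[str]:
    """Head atoms of gamma whose hypothetical insertion violates no constraint."""
    result = []
    for atom in sorted(gamma[0]):
        derived = facts(rules + [(atom, frozenset())])
        if not any(violated(c, derived) for c in constraints):
            result.append(atom)
    return result


def secrecy_preserving(rules: List[Rule], constraints: List[Constraint],
                       gamma: Constraint, secrets: Iterable[str]) -> bool:
    """At least two candidates, and no secret derivable under every candidate."""
    cands = candidates(rules, constraints, gamma)
    if len(cands) < 2:
        return False
    common = set.intersection(*(facts(rules + [(a, frozenset())]) for a in cands))
    return not (common & set(secrets))
```

With the hypothetical ground profile `p(a)`, `v(a) <- r(a)`, `v(a) <- s(a)` and the constraint `r(a) v s(a) <- p(a)`, both `r(a)` and `s(a)` are candidates. The check succeeds when only `s(a)` is secret and fails when `v(a)` is secret as well, because `v(a)` follows from either candidate.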

Each violated but G1-secrecy-preserving constraint can be separately considered. Let
γ1, ..., γk be these constraints and let Ki denote the disjunction of the candidates for a
secret of γi. Weak consistency due to the simultaneous violation of all γi tells p only that

K1 ∧ ... ∧ Kk ∈ Th(I).

Lemma 1  A profile DBp is G1-secrecy-preserving for RS if:

•  No secret clause is among Cp or Ip.

•  No secret fact can be derived from Ip.

•  DBp is consistent, or it is weakly consistent and each violated integrity constraint
is G1-secrecy-preserving.


3.2  G1 and the right to delete

Let DBp be a G1-secrecy-preserving profile for RS. We now assume that p has the right
to delete some data. Let φ ∈ RDp; we assume that the state db'p = I'p, consequent upon
the acceptance of the DELETE φ command, does not contain φ, and we denote by Δ
the set of atoms which have disappeared from F(Ip) together with φ, viz
Δ = F(Ip) \ F(I'p). This implies that a secret fact which has not been among F(Ip) is
not among F(I'p) either. We therefore concentrate on the integrity constraints in the
consideration of the following possible cases:

Note that a delete command is accepted in contravention of Cp if the new state is weakly consistent.


      DBp       DELETE φ     db'p
1.    strict    accepted     strict
2.    strict    accepted     weak
3.    strict    rejected     (no change)
4.    weak      accepted     strict
5.    weak      accepted     weak
6.    weak      rejected     (no change)

(1) DBp has been secrecy-preserving, and so is DB'p, since all constraints are satisfied.

(2) Let γ = α1 ∨ ... ∨ αm ← β1 ∧ ... ∧ βn be one of the violated constraints of Cp. The fact
that γ has been valid in DBp but is not in DB'p (while it retains validity in DB) means
that there is a substitution π so that some π(αi) is among the atoms which have
disappeared. Thus γ is only secrecy-preserving if Ki, with all its deleted links removed,
is also secrecy-preserving. This point diverts our attention to the fact that to handle
update commands we must maintain a history starting at a particular state. All atoms
from γ's head which p knows to have deleted can no longer belong to Ki.

Example 5  Let Cp = {s(X) ∨ r(X) ← q(X)}, Ip = {q(a), r(a)} and Gp = {s(a)}. After the
user has deleted r(a), the profile is weakly consistent, but the secret is disclosed.

(3), (6) According to our definition of a profile DBp as an independent database
which behaves in conformity with DB, the command is rejected since it would violate Cp.
No secrets are involved in this decision.

(4) This case is secrecy-preserving. Here a previously weakly consistent constraint
is now strictly consistent. If the set of positive ground facts has become smaller, this
can only happen when an instance of an atom of the constraint's body has been deleted.

The constraint γ is an indefinite clause, for otherwise the delete command would have been rejected.

Otherwise the command would lead into a weakly consistent state.

Example 6  Let Cp = {s(X) ∨ r(X) ← q(X)}, Ip = {q(a)} and Gp = {s(a)}. After the user
has deleted q(a), the profile is strictly consistent.


(5) A weakly consistent constraint's property of being secrecy-preserving is not altered
by delete commands since the sets Ki, once they are established, are invariant
with respect to them.


Example 7  Let Cp = {v(X) ∨ s(X) ∨ r(X) ← q(X)}, Ip = {q(a), r(a)} and Gp = {s(a)}.
After the user has deleted r(a), the profile is weakly consistent. The candidates for a
secret are v(a) and s(a). This situation is now independent of any other delete operations.

Lemma 2  A profile DBp is G1-secrecy-preserving for RD if:

•  DBp preserves G1-secrecy for RS.

•  Any newly created set of candidates for a secret has more than one member.


3.3  G1 and the right to insert

Let DBp be G1-secrecy-preserving for RD. Let φ ∈ RIp; we assume that
the state db'p = I'p, consequent upon the acceptance of the INSERT φ command, contains
φ, and we denote by Δ the set of atoms that have been newly included in F(Ip) together
with φ, ie Δ = F(I'p) \ F(Ip).

If Δ ∩ Gp = ∅, then the new clause does not produce any new facts which should be
kept secret from p. Hence the data do not disclose any secret. This is always the case
when the user behaves concordant with his trustworthiness.


However, the user can, eg by mistake or on purpose, insert a clause which does
produce a fact that should be kept secret from him.
We take the view that in this case the reaction of the database should be guided by the
external assumptions, ie p has not gained knowledge of any secret. The database may
not simply enter φ into RSp since this means admitting that the secret is disclosed. We
believe the database should assume that the user has created his own φ, and it must find
a way to resolve the name clash which has now arisen in the global name space of
data in I. One possible solution is the globalisation of φ in I by qualifying it with the
name of p. Let φ = s(c1, ..., cm) ← ..., then it is sufficient to qualify the head. In particular
we can either qualify the predicate symbol itself, ie turn s into s.p, or we can qualify the
terms of s, ie turn s(c1, ..., cm) ← ... into s(c1.p, ..., cm.p) ← ... This process must be
transparent to p. The application of this method results in two different clauses φ and φ.p: φ
is used in derivations initiated by persons who have the right to see it, and φ.p is used
as φ in p's derivations.
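
The two ways of qualifying a clause head can be illustrated with a short Python sketch. The `Clause` representation and both function names are hypothetical and serve only to make the renaming concrete; in line with the text, only the head is touched and the body is left unchanged.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Clause:
    predicate: str               # head predicate symbol, e.g. "s"
    args: Tuple[str, ...]        # head terms, e.g. ("c1", "c2")
    body: Tuple[str, ...] = ()   # body atoms, kept as opaque strings


def qualify_predicate(clause: Clause, user: str) -> Clause:
    """Turn s(c1, ..., cm) <- ... into s.p(c1, ..., cm) <- ..."""
    return Clause(f"{clause.predicate}.{user}", clause.args, clause.body)


def qualify_terms(clause: Clause, user: str) -> Clause:
    """Turn s(c1, ..., cm) <- ... into s(c1.p, ..., cm.p) <- ..."""
    qualified = tuple(f"{term}.{user}" for term in clause.args)
    return Clause(clause.predicate, qualified, clause.body)
```

Either function returns a new clause, so the original clause and the qualified clause can coexist in the database, which is what the method requires.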


Now that φ.p is a member of I, does it make sense to make it visible to other persons
as well, that is, can we treat φ.p as an ordinary element of I? The answer should be
'Yes'. The basic database semantics tell us that φ = TRUE ⇔ φ ∈ I. If p is just
trying to trick the database, then he will issue a delete command immediately after the
insert command, hoping that nobody has noticed his efforts. Whereas a clause which is
true for p will remain in the database as long as he considers it true. Purely by chance
this clause has been given a conflicting name. Thus we think it right to include p's φ as
s.p(c1, ..., cm) ← ... into I and treat it (with respect to all persons except p) as an
ordinary clause. The main advantage of this solution is that it influences neither the
semantics of DB nor of DBp, and we do not need to introduce notions like: is believed by...,
is only true for... etc. Moreover it preserves the secrecy of all Gp.

We now turn our attention to the role of integrity constraints in the possible state
transitions.


dbsec94.doc 


A. Spalka, 04.07.1994


      DBp       INSERT φ     db'p
1.    strict    accepted     strict
2.    strict    accepted     weak
3.    strict    rejected     (no change)
4.    weak      accepted     strict
5.    weak      accepted     weak
6.    weak      rejected     (no change)

Cases (3) and (6) pose no risks, since Cp provides the reasons for the rejection.

Case (1) does not introduce any new risks, either. In case (4), p must have inserted a
link of a Ki so that a previously weakly consistent integrity constraint is now satisfied.

Cases (2) and (5) can be considered together. Let us assume that a constraint
γ = α1 ∨ ... ∨ αm ← β1 ∧ ... ∧ βn of Cp has been valid in Ip but is now, due to some new
elements of I'p, weakly consistent. Thus there are some newly included atoms which have made
the constraint's body true so that no matching instances of the head's atoms are
present in F(I'p). Thus the insertion has created a new Ki. Again, γ is secrecy-preserving if
Ki has more than one link.

Example 8  Let Cp = {s(X) ∨ r(X) ← q(X) ∧ v(X)}, Ip = {q(a)} and Gp = {s(a)}. The
integrity constraint is now satisfied. After the user enters v(a), it is weakly consistent, and
since there are two candidates for secrets, it preserves G1-secrecy.

Lemma 3  A profile DBp is G1-secrecy-preserving for RI if:

•  DBp preserves secrecy for G1 and RD.

•  Insertions  valid  with  respect  to  Cp  are  handled  along  the  presented  guide-lines. 

•  Any newly created set of candidates for a secret has more than one member.

3.4  Final remarks

We would like to mention that we have also contemplated the use of a special term, for
example a term secret, for our purposes. However, we have abandoned this idea
because preliminary steps have already indicated that we are going to run into troubles
similar to those of the Null value.

Finally we note that G1-secrecy can be applied to multilevel secure databases. The
details of it are presented in Spalka (1994b).

4  Conclusion 

A logic database has served as the starting point for this paper. We have shown a way
to add persons and rights to it, so that the semantics of this database is properly
extended; the users have been assumed to possess their own knowledge. We have
introduced four possible meanings of confidentiality: when asked about a confidential clause,
the database can (informally) answer: 'I don't understand', 'I don't know', 'Yes' or
'Maybe'. Three of these answers are applicable in the presence of the Closed World
Assumption. The formal version of the last possibility, denoted by G1, is allowed to
provide the user with indefinite information on a secret. In the following we have
concentrated on the enforcement of G1.

As our main result, we have identified conditions which guarantee G1-secrecy in
situations where the user has the right to submit queries, delete and insert commands. The
distinguished role of integrity constraints has been given careful attention. We have
demonstrated that there is no fundamental conflict between security and integrity, but
instead the constraints' role as boundary conditions has become apparent throughout
the investigation. The presented formalism is well-founded, and has been completely
expressed within the limits of standard predicate logic.

Our next steps will be:

•  to investigate the enforcement of G2-security in logic databases and to extend it
to a multilevel security model

•  to search for a feasible, practical way to construct user profiles.


ABSTRACT

We have developed an object model for MLS object-oriented databases with per-element access
control. We argue that an MLS object model should specify structures and operations
supported by a trusted kernel and hence should be kept simple. “Convenience” operators and
enforcement of noncritical integrity guidelines are provided by nonassured layers above the
kernel. Also for simplicity, there is a single kind of labeled entity, the data element; these
labels determine the protection for several other types of constructs. We identify assumptions
in some prior models that conflict with commercial OODBMS requirements for language
compatibility and performance. Finally, we explore some conflicts between flexibility of
deletion and avoidance of polyinstantiation. Companion papers will address metadata,
polyinstantiation, and ordered collections.


1.  INTRODUCTION 

Object-oriented database management systems (OODBMSs) are gaining popularity due to their
inherent ability to represent conceptual entities as objects, paralleling the way humans view the
world. This representation has led to a new generation of object database managers
for applications such as computer aided design and computer aided manufacturing
(CAD/CAM), multimedia information processing, artificial intelligence, and process control.
For secure applications, this paper specifies an object model for securing multilevel secure
(MLS) information.

1.1.  The  Need  For  A  New  Model 

The model presented here is not radically different from other models that label individual
elements. However, it appears that no published model provides all of the following:

•  Model Flexibility: Some models impose restrictions to rule out states that appear
unreachable or unnecessary. Paradoxically, such restrictions may complicate the
model and implementation, as they must be documented and enforced, and will be
very difficult to remove later.

•  Element-level Access Control: The need to split natural problem domain objects into
objects each at a single security level greatly complicates application-building [Smit89
(sec. 5.2), Rose93]. Many of the existing models provide protection only on coarse
granules (i.e., objects).

•  Consistency with Industry Trends: Some existing models make method calls very
expensive, forbid multiple inheritance, require that the language encapsulate
individual objects, or require that the DBMS maintain the integrity of all stored
references. These decisions are inconsistent with the languages (e.g., C++) and
tradeoffs common in the OODBMS industry.

•  Explicit Operation Specifications: Few existing models define their operation semantics
in detail, especially for deletion and regrading. Thus it is difficult to judge whether an
operation's results can be expressed by the model's structures, or whether features
that impact operations at different levels, such as object existence and identity, can
cause channels.

This paper addresses the above issues, and extracts lessons that may be of use with other
models. A fuller model that also addresses metadata, polyinstantiation, and collections (e.g.,
sets, lists, trees) is presented in [Rose94].

1.2.  Paper  Overview 

1.2.1.  Role  of  a  Secure  Data  Model 

The specification of an MLS DBMS can be done at three layers: the kernel interface,
which defines the allowable structures and basic trusted operations; guidelines that advise
about structures that should be viewed with skepticism; and the application programmer
interface (which includes convenient operators built above the kernel).

The model presented in this paper defines the kernel interface, taking care to minimize its
size and yet provide appropriate flexibility for users. Kernel facilities must be present and
assured so that they can be relied on in implementing other features. Kernel operations on a
legal state always produce another legal state, even if privileged. In contrast, integrity
constraints are frequently enforced only at transaction commit time. We spend little time on the
application interface's additional “convenience” features, leaving them to DBMS designers.

Explicit guidelines are warnings about database states that seem dubious (e.g., dangling
references, inaccessible data, and odd patterns of classification). Since such states do not
compromise security, their detection need not be assured. In fact, the model is not violated if
DBMS builders omit enforcement of some guidelines, or if an application overrides warnings,
especially during intermediate steps in a larger process. Some integrity-related guidelines are
discussed in the appendix.


1.2.2.  Paper  Organization 

The Uniform Fine-grained Object Security (pronounced “U.F.O.s”) family of models
[Rose94] attempts to address all the difficulties mentioned in Section 1.1. This paper presents
a limited model called Single-valued UFOS (sv-UFOS), which explores the possibilities and
limitations of a model with neither polyinstantiation nor covert channels.

The model description begins in Section 2 with a simple nonsecure object model, above
which richer type systems could be built. We show how a polyinstantiated data model has
different natural security entities from one that forbids polyinstantiation. Section 3 discusses
security issues in more detail and illustrates how various situations would be modeled. Section
4 presents operator semantics (including the handling of security levels) and examines
alternative deletion operations. Section 5 covers design goals and implementation issues. And
finally Section 6 addresses future work and conclusions. In addition, the appendix discusses
integrity enforcement tradeoffs.

1.2.3.  Comparison with Previous Work

Several MLS object models have been previously published. Some of them assume that
an object contains information at only a single level, thereby forcing application programmers
to decompose the application's natural conceptual objects [Keef89, Jajo90, Mill92, Bert94].
This decomposition is very onerous and single-level restrictions would be difficult to remove
from a DBMS once implemented, so we consider such restrictions unsuitable for the long
term [Smit89, Rose93].

Sv-UFOS is closer to published object models that permit multilevel objects, such as
[Gajn88, Thur91, Smit89, Morg90].* Unlike some relational work (especially SeaView,
[Lunt94]), most of these MLS object models define operators’ semantics informally; specifying
sv-UFOS operators in detail caused us to find and fix many problems. Also, some of the
above models include many constraints on entities’ relative levels. The costs of such
constraints are discussed in Section 5.

We protect several of the varieties of information identified in [Smit89], including
associations to values, associations to abstract objects, values, objects, and labels. Section 3.2
illustrates how this is achieved by means of controls on read and write access to a single
construct, the element. (Our operational semantics do make special provision for existence
and labels; however, we benefit in that operations defined on ordinary attributes can also
manipulate existence information.)

Several models inspired by Smalltalk exploit object encapsulation and assume that private
attributes of an object can be accessed only by methods called on that object [Jajo90, Bert94,
Oliv94]. Such results appear very difficult to transfer to the mainstream of the OODBMS
industry, where C++ is the predominant language. Three major difficulties are examined
below.

First, and most seriously, C++ encapsulates classes rather than objects: that is, a method
on a C++ object o can access the private state of any object in class(o). Second, some models
assume that methods can write only in ways mediated by the OODBMS, but method code is
usually allowed to invoke any system capability. Finally, for high-assurance systems,
verifying that a language’s encapsulation is enforced may require assuring a substantial part of
the compiler.

* [Bisk88] permits objects whose elements have different access classes, in a DAC setting.


2.  THE UNDERLYING OBJECT MODEL

This section presents a simple, conventional object model which we subsequently extend to an
MLS object model. We also examine two candidates for the basic security entity: element (a
pair <object instance, attribute>) and association (a pair <element, value>).

An object instance (or just object) consists of a state (stored data) that associates a value
with each attribute, and a set of defined operations or methods that act upon that state.
(Methods are discussed only briefly in this report.) Objects are instantiated from a class
definition in which all of an object’s attributes and methods are defined; an element’s value
must be class appropriate. An object will often be denoted using an optional prefix (“a” or
“the”) and a human-understandable instance name, as in aShip or theNimitz.

Objects are an abstraction: user programs and UFOS model operations refer directly only
to values. There are two kinds of values: primitives, such as integers and strings, and object
references (often referred to as handles). Given a handle, the DBMS can test if it references an
object currently in the database, and if so can access that object. To avoid serious impact on
performance and language compatibility (as discussed in the appendix), referential integrity is
merely a guideline. If o denotes an object, @o denotes a value that can be used as a handle for
o. Given an object o and attribute A, o.A denotes the corresponding element. An
association assigns a value to a data element, the value of the association. For example,
two elements of an object might be associated respectively with the integer 40
and the handle @Singapore.

Several technicalities should be noted. First, an element’s value and an operation’s
arguments are not objects; only such values are passed to user programs. Second, the current
database state has at most one association, and hence one value, for each element. Thus we
can use o.A to refer to either an element or the association, if any, from that element. Third,
there is no requirement for a unique object identifier that is shared across security levels; at the
implementer’s option, an object may have any number of handles. Finally, the same primitive
or handle may be the value of many elements, potentially requiring different security
protections. For example, @Beijing may be the value associated with both
KissingerTrip.Destination and China.Capital.

Designers have traditionally emphasized software engineering and performance rather than security.

Both the object-oriented and security communities use the term object. To avoid confusion, this report uses
the term in the first sense, as an object possessing attributes and methods. The units of security are referred
to as entities.
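
The structures described above can be pictured with a minimal Python sketch. It is an illustration only, not the paper's specification: the class and method names are hypothetical, objects are identified by plain names, and security labels are left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple, Union


@dataclass(frozen=True)
class Handle:
    """A value that references an object, written @name in the text."""
    name: str


Value = Union[int, str, Handle]   # primitives or handles
Element = Tuple[str, str]         # (object name, attribute), written o.A in the text


class Database:
    def __init__(self) -> None:
        self.objects: Set[str] = set()
        # At most one association, hence at most one value, per element.
        self.associations: Dict[Element, Value] = {}

    def create(self, name: str) -> Handle:
        self.objects.add(name)
        return Handle(name)

    def references_object(self, handle: Handle) -> bool:
        """Test whether a handle references an object currently in the database."""
        return handle.name in self.objects

    def associate(self, element: Element, value: Value) -> None:
        # Referential integrity is only a guideline, so a dangling handle is not rejected.
        self.associations[element] = value

    def value_of(self, element: Element) -> Optional[Value]:
        return self.associations.get(element)
```

Because values are immutable and elements are plain keys, the same handle can be stored under several elements, as in the @Beijing example, while each element still holds at most one value.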

Operations create and delete objects and associations to values. For purposes of the
abstract model, values are never created, deleted, or changed; conceptually, all possible values
(integers, strings, handles) exist when the database is created. The implementation must
support this illusion without violating security. The link between a handle and an object’s
physical storage is maintained by the DBMS, transparent to the user.

2.1. Basic Security Entities

In sv-UFOS the basic security entity is the element; in polyinstantiated-UFOS (p-UFOS)
it is the association [Rose94]. The rationale for the different choices is discussed here.

One way to understand a conventional database is that each element identifies a property
of some real-world object, and each association asserts the current state of such a property. An
association is thus the basic unit of fact. A common notion of a polyinstantiated database is to
allow a separate assertion about such a property at each security level; some levels may make
no assertion. When associations are the protected entities, a user at level L knows nothing
about associations at any level L' dominating or incomparable to L. An update to an element
that already has an association at L' cannot be refused without creating a covert channel.

To avoid polyinstantiation, we associate a security level with each element, to govern the
sensitivity of the association it holds. Elements' security levels are readable by any user whose
level can see that the object exists; thus each update operation can test whether an update is
permitted, even if it cannot see the element's association and value. The model includes
operations to change an element's level.

Neither values nor elements' levels are "first class" security entities. Values always exist
in principle; the sensitivity is attached to their usage, as shown above for @Beijing. Elements
are rather closer to being entities, but they do not require their own labels.


3.  STRUCTURES,  CLASSIFICATION,  AND  SUBJECT  LEVELS 

This section defines the constructs of the secure object model. Section 3.1 describes the
security entities of the model, and their usage is illustrated in Section 3.2. Section 3.3 sketches
the model's operations (which are presented in detail in Section 4).

3.1  Security  Levels  of  Elements 

Security entities are those for which access classes can be individually assigned. To
simplify the theory and (we hope) the DBMS implementation, the only kind of security entity is
the element. The sensitivity of each element o.A is indicated by a classification, called
level(o.A). Levels typically include hierarchical and nonhierarchical components and are
partially ordered. Throughout the discussion, the level of the current user program is denoted
Ls.

Object existence is a security entity, as in other models, and operations need to test
whether an object's existence level is dominated by Ls. Rather than define a completely new
construct, we represent existence using an additional Boolean attribute, exist, in each
object. Most kernel and higher level features applicable to ordinary elements also apply to
exist. These features include labels, query languages, auditing of updates, triggers, and
(in p-UFOS) polyinstantiation to reflect disputes about an object's existence. We impose the
constraint that for every element o.A, level(o.A) ≥ level(o.exist). Element levels may be stored
in each object, along with exist. Alternatively, if all objects in a class have the same level for
some attribute, the schema can identify that level.

As in MLS relational DBMSs, a security subject corresponds to a process or executing
program, which is assumed to be single level. Data residing in a process's memory is part of
the process. Method invocations and attribute references execute as part of a program, like
ordinary procedure calls; they do not cause creation of a new subject.5 The OODBMS permits
access to the database only for the operations that are explicitly in the secure object model.

Our security policy for whether an unprivileged subject can read or modify a particular
element is in the tradition of Bell and LaPadula. An unprivileged subject at Ls can retrieve the
associated value of elements at or below Ls and can modify element values at or above Ls. The
levels of elements of the form o.A are protected as if they were part of existence information at
level(o.exist), and furthermore may be updated only by special operations in the model
(Section 4.3).
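
The read and write rules above can be sketched in Python. This is only an illustration under stated assumptions: a level is modeled as a hierarchical rank plus a set of compartments, and the names `Level`, `can_retrieve`, and `can_modify`, as well as the rank numbers, are hypothetical and not part of the model. Note that the Modify operation of Section 4.1 is stricter than the general rule, accepting a write only when the element's level equals Ls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    """A security level: a hierarchical rank plus a set of compartments."""
    rank: int  # assumed encoding, e.g. 0 = Unclassified ... 3 = Top Secret
    compartments: frozenset = frozenset()

    def dominates(self, other: "Level") -> bool:
        # Partial order: rank at least as high AND a superset of compartments.
        return (self.rank >= other.rank
                and self.compartments >= other.compartments)


def can_retrieve(subject: Level, element: Level) -> bool:
    """Reading is allowed when the element is at or below the subject."""
    return subject.dominates(element)


def can_modify(subject: Level, element: Level) -> bool:
    """Writing is allowed when the element is at or above the subject."""
    return element.dominates(subject)


U = Level(0)
S = Level(2)
S_A = Level(2, frozenset({"A"}))  # Secret, compartment A
S_B = Level(2, frozenset({"B"}))  # Secret, compartment B
```

Here `S_A` and `S_B` are incomparable: neither dominates the other, so a subject at one of them can neither read nor write elements labeled with the other.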


3.2. Representing an MLS Database in UFOS: Examples and Discussion

We now illustrate how the model's access controls for elements serve to protect other
information of concern, such as existence, object identifiers (handles), and the fact that a value
appears in the database. Any level that dominates level(o.exist) may appear on any element.

Figure 3-1 shows two instances of the class Person, p23 and p78, and one of class
Department, d123. The existence of object p23 is unclassified, as is p23.p_name. The
associations from p23.address to "Times Sq" and of p23.department to the handle @d123 are
Secret. p78 has a completely different pattern of security levels. Existence of p78 is
confidential, and handles of p78 will appear invalid to unclassified operations. In this
example, both Person records reference the same department, using associations to the same
handle @d123. However, p78's association with the department is less sensitive.
Update operators behave differently on exist than on other elements. If the OODBMS is implemented in an
object-oriented language, Existence_Element can be a subtype of type Element. Existence_Element can
inherit most operations from Element, but some (e.g., Modify) will be overridden.


Because object managers often handle both main-memory and (disk-based) persistent objects transparently,
performance could suffer if all method calls were significant operating system or security events. We are
not aware of any commercial object-oriented programming language or DBMS that implements each object
or each method call as a separate process. Such implementations seem appropriate only in an object model
devoted to large-granule distributed computing. Even for Smalltalk, whose conceptual model consists of
messages, the implementation typically approximates late binding for procedure calls.
[Figure 3-1: objects p23 and p78 of class Person (elements p_name, address, birthdate, department, exist)
and object d123 of class Department (elements d_name, phone#, budget, exist), each element shown with
its value and security label.]

Figure 3-1 Objects with Labeled Elements


3.3.  Operations  in  the  Model:  Overview  and  Illustrations 

This  section  sketches  the  model’s  operations  and  discusses  the  kinds  of  information  that 
are  protected  from  access. 

Operations  in  the  UFOS  model  are  procedures,  callable  from  either  methods  or  ordinary 
application  programs.  As  procedures,  the  model’s  operations  have  input  and  output  parameters 
that  are  bound  to  arguments  in  the  calling  program  (i.e.,  to  program  variables  that  store  or 
receive  values).  Issues  of  name  scope  and  encapsulation  would  be  part  of  the  programming 
language  but  are  not  considered  part  of  the  secure  object  model;  security  of  the  system  does  not 
rely  on  compiler  enforcement  of  object  encapsulation. 

The model's operations for accessing elements are retrieve and modify, and (expressed as
a reference to the exist attribute) effectively_exists. The operations on whole objects are create
and delete. Operations to raise and lower element levels are also provided. Like the majority of
commercial OODBMSs, we provide no explicit operations or protection for k-ary relationships
except as ordinary objects. We now illustrate some operations to show how they protect
various forms of information.

Retrieval of p23.department returns the handle @d123. The association to the handle is
protected at the Secret level, thus protecting the fact that Bill works in d123, and also protecting
information that the implementation might store in the handle.6 The model does not release
object identifiers to users. Thus far, the user has access only to the handle; neither the object
identifier, the object itself, nor any of the object's element values has been retrieved. To access
the referenced department's element values, one invokes retrieve with @d123 as an argument;
the new invocation is again subject to the security policy. The only way to determine that a
value such as "555-1234" or @d123 appears in the database is by retrieving it as the value of
an element.

The example schema illustrates that UFOS permits objects that are visible but unreachable
at some security level. For example, d123 has Unclassified d_name, but there is no
Unclassified reference to d123. This situation can arise in several ways, e.g., through deletion
of a reference or through Regrade. As in single-level databases, an object can become totally
unreachable. Like most programming languages, we handle such unreachable objects via
garbage collection rather than by constraints enforced on each operation. Tradeoffs for integrity
enforcement are discussed further in the appendix.


4. OPERATIONS OF SV-UFOS

This section describes the operations of the single-valued UFOS model and illustrates their use.
The model includes operations on a single object, on the association of a single element, and on
the level of a single element. The operations appear not to have channels, but we have not
given a formal proof. Section 4.4 discusses alternatives to the potentially clumsy deletion
operation; we show that they require relaxing the assumption that every object has a unique
lowest level.

The operation interfaces were chosen for clarity in expressing the model; the application
program interface can provide a layer of syntactic sugar. For example, to be compatible with
legacy code that expects only a value, other output parameters (e.g., return code, element's
level) might be returned by separate calls. This syntactic layer might be different for each
programming language from which the OODBMS can be invoked and is not specified here.
For each operation, we give a shorthand description in terms of objects, define the interface,
and then give the semantics (including return codes to indicate the situation encountered).

If h denotes a handle, ↑h will denote the object referenced by h. As a convention, we
consider ↑h to be a way to identify the "real" operand of the operation, an object denoted o. A
return code invalid_handle indicates that the handle references no object, or that the referenced
object does not exist at or below Ls.

Operations cannot assume that a handle presented as an argument is legitimate, or that a
user who presents a handle is authorized to access the referenced object. Arguments come
from untrusted user code, which may generate them randomly and maliciously, or may hide
(e.g., as a String) a low handle to an object that has been upgraded. In the operations below,

Handles are values that may be passed to application code, and one must assume that users have access to
all information in the stored handle, such as (in some implementations) the fact that d123 is an instance not
of class Department, but of the subclass Covert Ops Dept.
the Boolean function effectively_exists(h) returns true iff ↑h denotes an object o in the current
database, and level(o.exist) is dominated by Ls. It may be useful to consider this as an additional
model function.

4.1.  Accessing  Elements:  Retrieve,  Modify 

Attribute  values  are  retrieved  or  modified  only  by  the  model’s  operations  on  elements. 

Retrieve: Traverses the association to return the element's value; also returns the
element's level.

Interface: 

Inputs:  h:  Handle,  A:  Attribute 

/* The object referenced by h must have an attribute A in its class
definition. The output value's type is determined by A. */

Outputs: val: Value, L: Level

Semantics:

If o = ↑h does not effectively exist, return null and the return code invalid_handle.

If there is no association from o.A, or that association is not dominated by Ls,
return val = null and the element's level.

Otherwise  return  the  value  and  level  for  o.A. 

Modify  o.A  =  newValue:  The  value  of  element  o.A  is  changed. 

Interface: 

Inputs:  h:  Handle,  A:  Attribute,  newValue:  Value 

Outputs:  none  (except  return  codes) 

Semantics: 

If o = ↑h does not effectively exist then return (code = invalid_handle).

If level(o.A) is not Ls then return (code = wrong_level).

Assign newValue to o.A.

Example:

Consider p78.Address in Figure 3-1. A confidential or secret request will read the
confidential information that level(p78.Address) = "ts" and return wrong_level,
thereby avoiding both polyinstantiation and the familiar covert channel. A top secret
modify will succeed, while Unclassified will return invalid_handle.
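
The Retrieve and Modify semantics, and the p78.Address example, can be traced with a small Python sketch. Several things here are assumptions made for brevity rather than parts of the model: levels are reduced to a total order (the model's levels are only partially ordered), the database is an in-memory dictionary named `DB`, the return code `"ok"` for successful calls is invented, the address string is a placeholder, and the attribute is assumed to exist in the object's class.

```python
from dataclasses import dataclass, field

# Assumption: levels are reduced to a total order for brevity.
U, C, S, TS = 0, 1, 2, 3


@dataclass
class Element:
    level: int
    value: object = None  # None models "no association"


@dataclass
class Obj:
    elements: dict = field(default_factory=dict)  # attribute name -> Element


DB = {}  # handle -> Obj; a hypothetical in-memory stand-in for the database


def effectively_exists(h, ls):
    """True iff h references an object whose existence is visible at ls."""
    o = DB.get(h)
    if o is None:
        return False
    exist = o.elements["exist"]
    return exist.value is True and exist.level <= ls


def retrieve(h, attr, ls):
    """Return (value, level, code) for element o.A as seen at level ls."""
    if not effectively_exists(h, ls):
        return None, None, "invalid_handle"
    e = DB[h].elements[attr]
    if e.value is None or e.level > ls:
        return None, e.level, "ok"  # value withheld; level still returned
    return e.value, e.level, "ok"


def modify(h, attr, new_value, ls):
    """Assign new_value to o.A only when the element's level equals ls."""
    if not effectively_exists(h, ls):
        return "invalid_handle"
    e = DB[h].elements[attr]
    if e.level != ls:
        return "wrong_level"
    e.value = new_value
    return "ok"


# p78 as described in the example: existence Confidential, address Top Secret.
DB["@p78"] = Obj({"exist": Element(C, True),
                  "address": Element(TS, "placeholder address")})

print(modify("@p78", "address", "x", U))   # invalid_handle
print(modify("@p78", "address", "x", C))   # wrong_level
print(modify("@p78", "address", "x", TS))  # ok
```

The Unclassified caller cannot distinguish p78 from a handle that references nothing, while the Confidential caller learns only the element's level, which is already readable at level(p78.exist).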
4.2 Object Operations: Create, Delete, Test_Identity

Create: Create a new object.

Interface:

Inputs: c: handle of a class

Outputs: h: a handle for the newly created object o

Semantics:

Create a new object o of the indicated class. Initialize every element's
level to Ls. Assign true to o.exist. Return the object's handle.

Delete is a more difficult operation with multilevel objects, because an object's existence spans
multiple levels. Section 4.4 discusses the possibility of greater flexibility.


Delete_up: Deletes the object's information at Ls or higher levels.

Interface:

Inputs: h: handle for an object (denoted o)

Outputs: none (except return codes)

Semantics:

If o = ↑h does not effectively exist, return invalid_handle. Otherwise, destroy all
associations at Ls or higher from attributes of o (i.e., set these attribute values to
null). If o.exist is at Ls, the object is
considered deleted, and h will no longer be a valid handle.

The model also provides an unprivileged operation, Test_Identity, to test
whether two handles reference the same object.
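
A minimal Python sketch of Create and Delete_up follows. It reads the Delete_up semantics as: reject a handle whose object does not effectively exist, null every association at Ls or higher, and remove the object when its existence is itself at Ls. The totally ordered levels, the `CLASSES` schema, the `DB` dictionary, the `"ok"` return code, and the handle format are all assumptions for illustration; in particular, handles generated this way are guessable, which a real implementation would need to consider.

```python
import itertools
from dataclasses import dataclass

U, C, S, TS = 0, 1, 2, 3  # assumption: a totally ordered set of levels


@dataclass
class Element:
    level: int
    value: object = None  # None models "no association"


CLASSES = {"Person": ["p_name", "address"]}  # hypothetical schema
DB = {}                                      # handle -> {attribute: Element}
_next_id = itertools.count(1)


def create(class_name, ls):
    """Create an object; every element, including exist, starts at ls."""
    h = f"@{class_name.lower()}{next(_next_id)}"
    obj = {attr: Element(ls) for attr in CLASSES[class_name]}
    obj["exist"] = Element(ls, True)
    DB[h] = obj
    return h


def delete_up(h, ls):
    """Destroy the object's associations at ls or higher."""
    obj = DB.get(h)
    if obj is None or obj["exist"].level > ls:
        return "invalid_handle"
    for attr, e in obj.items():
        if attr != "exist" and e.level >= ls:
            e.value = None  # destroy the association
    if obj["exist"].level == ls:
        del DB[h]  # the object is deleted; h is no longer a valid handle
    return "ok"
```

A caller above the object's existence level removes only the higher-level associations; the object disappears only when Delete_up is issued at level(o.exist).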

4.3 Operations that Manipulate an Element's Level

Two operations change the level of an element (and hence of its association).
Both operations must run at level(o.exist). (At the end of the section we discuss
ways to gain more flexibility.) Their semantics are given in terms of a hypothetical assignment
method Set_level(h, A, new_level) that also runs only at level(o.exist).


Reduce_level o.A: Effectively resets the level of an element o.A to its lowest value, discarding
an association provided at a higher level.

Interface:

Inputs: h: Handle, A: Attribute

Outputs: none (except return code)

Semantics:

If (↑h does not denote an object o such that level(o.exist) = Ls) or
(A is not an attribute of o) then return // with an appropriate error code
Assign o.A = null; // destroy existing value, to avoid possible information flow
Set_level(h, A, Ls)

Regrade o.A: Changes the level of an element without losing its association. Any regrade
except an upgrade issued at level(o.exist) requires privilege.

Interface:

Inputs: h: Handle, A: Attribute, newLevel: Level
Outputs: none (except return code)

Semantics:

If h does not denote an object for which level(o.exist) = Ls,
or A is not an attribute of o, or
(newLevel does not dominate level(o.A) and request is not privileged)
then error_return

Otherwise Set_level(h, A, newLevel)
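
The two level-changing operations can be sketched in Python as follows. The sketch makes several simplifying assumptions: levels are totally ordered, handle resolution is omitted (the `obj` argument stands for ↑h), the return codes `"ok"` and `"error"` are invented, and `set_level` is a hypothetical stand-in for the Set_level helper.

```python
from dataclasses import dataclass

U, C, S, TS = 0, 1, 2, 3  # assumption: a totally ordered set of levels


@dataclass
class Element:
    level: int
    value: object = None  # None models "no association"


def set_level(obj, attr, new_level):
    """Hypothetical assignment helper; not callable by user programs."""
    obj[attr].level = new_level


def reduce_level(obj, attr, ls):
    """Reset o.A to the object's lowest level, discarding its value."""
    if obj is None or obj["exist"].level != ls or attr not in obj:
        return "error"
    obj[attr].value = None  # destroy the value to avoid information flow
    set_level(obj, attr, ls)
    return "ok"


def regrade(obj, attr, new_level, ls, privileged=False):
    """Change o.A's level while keeping its association."""
    if obj is None or obj["exist"].level != ls or attr not in obj:
        return "error"
    if new_level < obj[attr].level and not privileged:
        return "error"  # an unprivileged request may only upgrade
    set_level(obj, attr, new_level)
    return "ok"
```

For instance, an Unclassified caller can upgrade aShip's captain element to Secret with `regrade`, but lowering it again without privilege requires `reduce_level`, which discards the value.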

We now sketch two ways to gain more flexibility. One approach is to require that the level of o.A be
classified to match its contents, e.g., it would be Secret that level(o.A) is Secret, TS<A> that it
is TS<A>, etc. This permits tighter protection of levels, and allows unprivileged upgrade by
users who are above level(o.exist) but dominated by level(o.A). A user who is refused access
to level(o.A) must assume that the level has been upgraded to a higher level. This does not
appear to be a channel, when upgrades are initiated from below. Since these two approaches
are so similar, it might be feasible to provide a single implementation that could be installed
according to either convention.

A further generalization would be to make levels into ordinary labeled entities. However, that
would necessitate storing an additional label; in both the previous cases the sensitivity of
level(o.A) was determined by ordinary elements' labels (respectively o.exist or o.A).

4.4  Alternatives  to  the  Delete  Operation 

To compare sv-UFOS with a DBMS that stores single-level objects, observe that an
application object (e.g., aShip) that contains data at multiple levels would be represented by
multiple DBMS objects; to delete it, the user must issue a delete command for each relevant
level, via separate logins. Deleting an entire object is thus significantly easier in sv-UFOS.
It is of little help, however, in deleting selected portions of an object. This section
investigates alternatives that cater to partial deletion. The tentative conclusion is negative: the
generalization pays much of the implementation price of polyinstantiation, while providing
much less expressive power.

We consider the alternative operation, delete_at_level, which sets elements at Ls to null
without changing other levels. When Ls > level(o.exist) there is no problem. When
Ls = level(o.exist), exist must somehow be rescued and stored at the higher levels. It may
also be desirable to rescue other elements of an object about to be deleted (e.g.,
aShip.Captain). Call the minimal level(s) above Ls at which the object has associations the
rescue level(s).

We treat these difficulties as manifestations of a more general issue that also arises in
nonsecure databases: the conflict between user autonomy and data sharing. One group (e.g.,
low) may provide data to another (e.g., high) but retain the right to update or delete
information without permission from the consumers. The issue has been explored in
connection with referential integrity [Morg90, Qian94], but can arise even among elements of
a single object. The solution does not lie in changing the semantics of basic operations; rather,
a consumer must create a private copy of required information, possibly when first reading it,
possibly at the last moment using a trigger on modify or delete, or possibly by versioning.
Delete_at_level is an operation of this sort. When there is a unique rescue level, the
operation semantics appear straightforward. (The number of rescue levels is known at
level(o.exist).) All rescued associations are regraded to the rescue level; this rescued
information includes exist = true and (if specified in application-defined triggers) useful
application elements that were at Ls. For this case, we can relax a restriction in [Morg90] and
allow the rescuer to modify these associations at their new level.7

The case where there are incomparable rescue levels appears to require weak
polyinstantiation, in which the same assertion is made at multiple security levels (in contrast to
normal polyinstantiation, in which assertions made at different levels may differ). Consider an
object aShip which exists at Unclassified, but has additional attributes at two incomparable
classified levels <A> and <B> and at their upper bound <AB>. What is to be the result of a
delete at the unclassified level? To be faithful to the outside world, the result should have a
single object aShip that holds information in compartments <A>, <B>, and <AB>.
References to aShip should continue to be usable by processes at these levels. Note that one
cannot speak unambiguously about the level of rescued elements.

Weak polyinstantiation causes operational difficulties. There appears to be no way to
modify rescued attributes or to delete aShip from one compartment without causing a channel
between compartments <A> and <B>.8 We believe that polyinstantiation (of associations)
provides a more flexible and cleaner way of separating actions at different levels [Rose94].

Since it does not provide weak polyinstantiation of existence, the model in [Morg90] appears to require
creation of a separate object for each rescue level.


It may be possible to confine weak polyinstantiation to exist. To do so, one must relax the model's
constraint that aShip.Captain must dominate aShip.exist; permitting this dubious state does not appear
5. DESIGN GOALS AND IMPLEMENTATION ISSUES
5.1.  Design  Goals 

Our design process may have been unusual in the attention given to the goals of
simplicity and robustness. These goals led to an effort to minimize restrictions within the
kernel's states and operations. We therefore proposed that the model be specified separately
from application programmer aids and integrity-promoting guidelines.

Model restrictions impose burdens on the DBMS's specifiers (to document and explain
them), on DBMS implementers (to enforce them), on security evaluators (to evaluate the
enforcement), on DBMS maintainers (if restrictions are altered), and especially on users (to use
them or work around them).

To make the model simpler, we minimize the number of top-level concepts and avoid
explicit restrictions. Existence is treated as a specialization of Element. Protection of
existence, values, and object identifiers is built into operation semantics but does not require
additional labeling. Moreover, we impose just one constraint on label assignments: that each
element dominates exist (and even this can be relaxed). Other models vary in these respects.
For example, due to use of object names, [Gajn88] uses seven inequalities on levels. Several
models make difficult objects read-only. Metadata management is another area where we
believe many inequalities should be relegated to guidelines outside the TCB.

Flexibility  and  robustness  were  also  important  design  goals.  Once  the  model  imposes  a 
restriction,  it  becomes  difficult  to  relax  because  it  may  be  assumed  in  code  throughout  the 
DBMS.  Furthermore,  so  few  applications  have  been  built  over  MLS  DBMSs  that  we  cannot 
confidently  declare  any  restriction  harmless.  Arguments  that  there  is  no  utility  to  a  particular 
database  state  often  tacitly  make  assumptions  that  ought  to  be  debated,  for  instance  that  states 
reachable  only  through  privileged  operations  should  be  illegal. 

A  more  conservative  and  robust  course  is  to  specify  a  general  model,  determine  an 
implementation  approach,  and  then  accept  only  those  restrictions  that  greatly  simplify  the 
implementation.  Restrictions  motivated  by  needs  of  the  formal  model  rather  than  of  users 
deserve  great  suspicion. 

5.2.  Implementation  in  a  Client-Server  Environment 

Because sv-UFOS is a model that labels individual elements, its data structures do not present
new challenges. Conventional approaches include storing each object as a single multilevel
record, or splitting it into multiple single-level records.

OODBMSs do raise one important kind of design issue: how to pass information
between the database (typically on a server) and the application (typically on a client). In
relational systems, requests affect sets of tuples, which are then passed sequentially to or from
the client. The client-side software is relatively simple. In OODBMSs the passed information


harmful. The rescue can then leave aShip.Captain unchanged (i.e., Unclassified). The operations' semantics
still allow ordinary elements to be read only at levels that dominate exist, and will prevent updates.
is often much closer to the representation of data in storage: objects or pages. OODBMS client
software maps from this direct representation and, upon request from a user program, directly
retrieves or updates the desired elements. Eventually the object or page is written back to
permanent storage.

The above scheme gives very good performance on OODBMS workloads, but raises
serious security issues. In particular, if the client runs on a single-level or low assurance
machine, any information passed to the client must be considered revealed, and any multilevel
object or page returned from the client must be considered corrupted. We are currently
investigating the degree to which various DBMSs' information-passing mechanisms can be
protected from these vulnerabilities.


6. CONCLUSIONS

Sv-UFOS is an object-oriented data model that provides fine-grained protection of data by
protecting a single kind of security entity, namely elements. It contains many details that may
be of use in future models. Existence information can use most operations defined on ordinary
attributes. Element levels can be changed by unprivileged operations, though not as flexibly as
we would like. Another contribution is that the model specifies operations in detail. Working
out the details led to many improvements whose necessity was not obvious without operation
definitions, especially for deletions and regrading.

We discussed the role of an MLS object model in an MLS DBMS, to distinguish the layer
at which features should be specified. This practice helped in reducing the size of the TCB.

The motivation for many design decisions was discussed. We devoted considerable attention
to keeping the model's specifications consistent with priorities and practice in the OODBMS
industry, especially compatibility with existing languages and good performance. We did not
rely on encapsulation that violates C++ rules and is difficult to assure, did not change the
language semantics to make integrity checks mandatory, and treated methods as procedures
rather than security events.

We conclude that simultaneous avoidance of channels and polyinstantiation is feasible,
but leads to users having considerably less flexibility (especially for Delete and Upgrade) than
in the polyinstantiated UFOS model.
APPENDIX A. INTEGRITY ENFORCEMENT TRADEOFFS

We argue here that certain kinds of integrity enforcement should be performed only if requested
by a data administrator who accepts their costs. We also discuss reasons why further research
may be needed before MLS technology for integrity enforcement can be robust. Our
arguments are based primarily on the nonsecurity requirements of the OODBMS market, rather
than on covert channel issues. The reduction in the size of the Trusted Computing Base (TCB)
is a bonus.

We discuss referential integrity in detail and then mention some multilevel integrity
constraints for which similar arguments apply.

A.1 Referential Integrity

A referential integrity constraint asserts that the database contain only valid pointers. That
is, if some element's value is a nonnull handle (denoted h), then the database must contain an
object o such that o = ↑h. One decision in UFOS that has proved controversial is that we allow
handle-valued attributes for which referential integrity is not enforced. We advocate accepting
reduced integrity primarily because we wish to conform to practice in nonsecure DBMSs where
references are, by default, unchecked.9

Nonsecure  DBMSs  accept  suboptimal  integrity  in  order  to  obtain  compatibility  and 
performance.  If  all  references  were  subject  to  integrity  checks,  several  problems  would  arise: 

Existing  programs  would  now  crash  whenever  they  performed  updates  that 
invalidated  references.  Such  programs  need  not  be  incorrect— they  may  simply  defer 
the  reference  check  till  the  reference  is  used. 

When a value is assigned to any element of type Handle, one must check that the
reference is valid (since the argument comes from untrusted user code). Thus, an
assignment that could have been performed on a client might now require access to
disks on a server. One might also need to update an object reference count, possibly
requiring that an update go to the server.

The human creator of an object o may insist on the right to delete it, even if another
user holds a handle. Referential integrity can then be enforced only if the DBMS
maintains a costly inverse-traversal structure that, for each o, identifies all elements
whose values are handles of o. Every update to a handle-valued element causes an
update to this structure.

We do believe that the data administrator ought to have the option of requesting
enforcement on any reference-valued attribute. However, even in non-MLS systems, the


C++ and Smalltalk offer no referential integrity checks, while OODBMS products often check only when
explicitly requested. In relational systems, an attribute can be joined with a foreign key even if no foreign
key constraint has been declared; these are references with neither integrity checks nor type checks.
ABSTRACT

We desire a mechanism for defining security policies that is both flexible and easy to use. This allows
us to manage dynamic changes to database schemas and security policies with the least amount of effort.
Any such mechanism will therefore have to be (1) discretionary (for changing the security policy) and (2)
defined over abstract data (for changing the database schema). Currently, the only security mechanisms
with  these  features  operate  on  views.  Despite  their  abstractness,  specifying  a  policy  using  views  requires 
detailed  knowledge  of  how  the  views  are  implemented.  Furthermore,  because  of  the  view  update  problem, 
we  need  a  special  mechanism  for  enforcing  modification  policies  over  views,  one  that  reveals  the  structure  of 
the  underlying  data  to  any  user  that  is  allowed  to  modify  the  database. 

This  paper  proposes  a  new  data  abstraction  called  the  reconfigurable  data  object,  with  properties  that 
simplify  the  creation  of  abstract  security  policies.  In  particular,  it  will  allow  us  to  define  modification  policies 
over  our  database  while  hiding  implementation  details  from  both  the  end  user  and,  potentially,  the  person 
responsible  for  specifying  and  maintaining  the  overall  database  security  policy. 

*This research was not funded by any grant whatsoever.
1 Introduction


Abstraction is a fundamental concept in computer science: it simplifies the process of solving problems
by eliminating details of the problem space that are not relevant to solving the problem. Maintaining the
security of information stored in a database is a problem we are interested in solving, and we use abstraction
to ignore details about data representation and database implementation in creating a set of mechanisms
to enforce security. Abstraction is a particularly useful concept for this problem, because we can also use it
to hide details about database implementation from the DBMS user, making the DBMS easier to use and
increasing its security. This explains why views have been used by database administrators to restrict access
to information stored in a database.

Views are, however, an insufficient abstraction for specifying modification policies (policies that
determine when data may and may not be modified), because views in general cannot be updated directly.
Furthermore, views do not hide their own implementation details, and therefore policies over them can be
just as complex and "implementation dependent" as policies over base relations. What we desire is a data
abstraction that hides its own implementation details as well as those of the underlying data. Security
policies defined over such an abstraction would be easier to specify (because there are fewer details to remember)
and more secure (because less is known about how the database is implemented).

This  paper  is  about  just  such  an  abstraction,  which  we  call  the  reconfigurable  data  object  (or  RDO  for 
short).  With  this  abstraction,  we  can  create  security  policies  that  are  discretionary  and  schema-independent. 

Section  2  defines  reconfigurable  data  objects  and  provides  a  few  examples.  Section  3  shows  how  we  can 
specify  modification  policies  over  RDOs.  Section  4  defines  algorithms  for  enforcing  these  security  policies, 
and  demonstrates  how  they  work.  Section  5  compares  view-based  and  RDO-based  modification  policies 
through  an  example  (the  former  differing  radically  from  standard  view-based  policies).  Finally,  Section  6 
offers  some  concluding  remarks. 

2  RDOs  and  Related  Concepts 

Before  we  can  define  what  an  RDO  is,  we  need  to  explicate  the  concept  of  a  function  over  a  database — in 
our  case,  a  relational  database.  Many  existing  DBMSs  allow  the  creation  of  functions  that  consist  of  a  query 
in  their  own  DML  containing  one  or  more  parameters  to  the  function.  When  the  function  is  called,  the 
parameters  are  bound  to  the  values  passed  as  arguments  to  the  function,  and  the  query  is  evaluated.  For 
example,  consider  the  following  EXCESS  function  [CDV87,  p  24]: 

define Employee function youngerKids(maxAge: int4) returns int4
(
retrieve (count(C from C in this.kids where C.age < maxAge))
)
The function youngerKids has one parameter (maxAge), and its body is an EXCESS query that uses maxAge
in the where clause of the query. The value returned by the function call youngerKids(10) is the value
returned by the body of the function with maxAge bound to the value 10; in other words, the number of
children of all people in the Employee relation that are less than 10 years old. Definition 1 formalizes this
concept.

Definition 1: A database function f is a function of one or more parameters whose body is a
SQL query with the following properties:

• All tuple variables are explicitly named

• All parameters of f appear at least once in the WHERE clause □

Figure  2.1(a)  defines  a  database  function  salary  that  satisfies  Definition  1:  it  has  a  single  parameter  emp 
that is used in the WHERE clause of the function, and the body of the function is a SQL query with one tuple
variable  named  E.  When  the  function  is  called  with  an  employee  ID  number  as  an  argument,  the  function 
will  return  the  salary  of  that  employee.  But  in  order  for  this  to  work,  the  function  query  must  be  evaluated 
against  a  database  with  a  schema  that  supports  the  relations  and  attributes  used  in  salary.  Without  such 
a  database,  we  cannot  evaluate  a  call  of  function  salary.  Definition  2  formalizes  this  concept. 

Definition 2: Let f be a database function, and 𝒟 be a relational database. 𝒟 supports f iff:

1. For every relation name R in the FROM clause of f, R is in the schema of 𝒟

2. For every expression of the form V.attr, where V is a tuple variable over relation R in the
body of f, attr is an attribute of R in the schema of 𝒟 □

The  database  in  Figure  2.1(b)  supports  function  salary  in  Figure  2.1(a):  it  contains  a  relation  named 
Employee,  which  contains  attributes  wage  (from  the  SELECT  clause)  and  eid  (from  the  WHERE  clause).  The 
relation  also  has  attributes  name  and  dept,  which  are  not  used  by  the  database  function  and  therefore  do 
not  affect  the  satisfaction  of  Definition  2. 
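
The two conditions of Definition 2 can be checked mechanically. The following is a minimal sketch, not part of the original paper: the schema is assumed to be a plain dictionary from relation names to attribute sets, and the helper name `supports` and the regular expression used to find `V.attr` expressions are illustrative choices.

```python
import re

# Hypothetical schema representation: relation name -> set of attribute names.
SCHEMA = {"Employee": {"eid", "name", "dept", "wage"}}

def supports(schema, from_clause, body):
    """Check Definition 2 for a function whose FROM clause maps tuple variables to relations."""
    # Condition 1: every relation named in the FROM clause is in the schema.
    for relation in from_clause.values():
        if relation not in schema:
            return False
    # Condition 2: every V.attr expression names an attribute of V's relation.
    for var, attr in re.findall(r"\b(\w+)\.(\w+)\b", body):
        if var in from_clause and attr not in schema[from_clause[var]]:
            return False
    return True

SALARY_FROM = {"E": "Employee"}  # tuple variable -> relation
SALARY_BODY = "SELECT E.wage FROM Employee E WHERE E.eid = emp"

print(supports(SCHEMA, SALARY_FROM, SALARY_BODY))                          # True
print(supports({"Employee": {"eid", "name"}}, SALARY_FROM, SALARY_BODY))   # False: no wage
```

The unused attributes name and dept play no role in the check, which matches the remark above that they do not affect the satisfaction of Definition 2.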

With  a  formal  definition  of  database  functions,  and  the  support  of  such  a  function  by  a  database,  we  can 
now  give  a  formal  definition  of  the  value  of  a  call  to  a  database  function.  This  definition  is  given  below,  with 
examples of its use given in Figure 2.1(c).

Definition 3: Let f be a database function, 𝒟 be a database that supports f, and a be a list
of values. The value returned by a call of f with argument list a evaluated against 𝒟 (denoted
f(a, 𝒟)) is the value returned by evaluating the body of f, with the parameters of f bound to their
respective values in a, against the corresponding tuples of 𝒟. □

Figure  2.1(c)  shows  the  value  returned  by  two  different  calls  to  our  database  function  salary  when 
evaluated  against  the  database  in  Figure  2.1(b).  For  the  first  call,  there  is  only  one  assignment  of  tuples 
to our tuple variable that satisfies the WHERE clause of salary (namely, assigning E the fourth tuple of
Employee); therefore the function returns a singleton set. For the second call, there is no such assignment
(as there is no tuple in Employee with an eid attribute of 21250), therefore the call returns the empty set.


(a)  (b)

salary(200) = {21250}

salary(salary(200)) = {}

(c)

Figure 2.1: The creation and use of database functions: (a) a database function definition, (b) a
sample database containing the relation used by our database function, (c) two sample database
function calls and their values given our sample database.
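
The two calls of Figure 2.1(c) can be reproduced with a small sketch. It assumes Python's standard sqlite3 module as the DBMS and models the database function as a parameterized query; the helper names `make_db` and `salary` are illustrative and not taken from the paper.

```python
import sqlite3

def make_db():
    """Build the sample Employee relation of Figure 2.1(b) in memory."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE Employee (eid INTEGER, name TEXT, dept INTEGER, wage INTEGER)")
    db.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)",
                   [(100, "Aaron", 1, 55000), (101, "Baker", 1, 35000),
                    (102, "Chase", 1, 37500), (200, "Davis", 2, 21250)])
    return db

def salary(emp, db):
    """f(a, D): bind the parameter, evaluate the body, and return the result as a set."""
    rows = db.execute("SELECT E.wage FROM Employee E WHERE E.eid = ?", (emp,))
    return {row[0] for row in rows}

db = make_db()
first = salary(200, db)
print(first)   # {21250}

# The second call feeds each value of the first result back in as an employee ID.
second = set()
for value in first:
    second |= salary(value, db)
print(second)  # set()
```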

Having  formalized  the  concept  and  attributes  of  database  functions,  we  now  define  the  basic  abstraction 
of  our  security  model:  the  reconfigurable  data  object. 

Definition 4: Let f be a database function and a be a list of values such that for some database 𝒟
that supports f, f(a, 𝒟) is nonempty. Then the pair (f, a) is a reconfigurable data object (or RDO). □

An RDO is a syntactic construct consisting of a database function name and a proper list of arguments
to the function. For example, (salary, (200)) is the RDO that corresponds to the first function call in
Figure 2.1(c). Do not confuse the RDO (f, a) with the function call f(a, 𝒟). RDOs represent abstract values
that are independent of the structure and contents of the database, and independent of the function bound
to f. This not only reduces the amount of information a user needs to know to obtain a value from the
database, it also allows great flexibility in reconfiguring the structure of the database. Furthermore, and for
our purposes most importantly, this abstraction hides both the contents of the database and the algorithm
used to evaluate the RDO from the user, making inference of unauthorized information that much more
difficult.
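
The distinction between an RDO and a function call can be illustrated with a short sketch. The second database layout (a separate Payroll relation) is a hypothetical reconfiguration invented for this example, as are the names `RDO`, `evaluate`, `salary_v1` and `salary_v2`; only the pair (salary, (200)) comes from the text.

```python
from typing import NamedTuple, Tuple

class RDO(NamedTuple):
    function: str   # name of a database function, not its body
    args: Tuple     # argument list

# Layout 1: the wage is stored directly in Employee rows (eid, wage).
db_v1 = {"Employee": [(100, 55000), (200, 21250)]}
# Layout 2 (hypothetical reconfiguration): wages moved to a separate Payroll relation.
db_v2 = {"Employee": [(100,), (200,)], "Payroll": [(100, 55000), (200, 21250)]}

def salary_v1(emp, db):
    return {wage for (eid, wage) in db["Employee"] if eid == emp}

def salary_v2(emp, db):
    return {wage for (eid, wage) in db["Payroll"] if eid == emp}

def evaluate(rdo, bindings, db):
    """Look up the function currently bound to the RDO's name and call it."""
    return bindings[rdo.function](*rdo.args, db)

r = RDO("salary", (200,))
print(evaluate(r, {"salary": salary_v1}, db_v1))  # {21250}
print(evaluate(r, {"salary": salary_v2}, db_v2))  # {21250}
```

The RDO itself never changes; only the function bound to the name salary and the schema underneath it do, which is the sense in which the object is reconfigurable.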

In the next section, we show how to define modification policies with RDOs as the unit of protection,
and what such policies mean.

salary(emp)
{
SELECT E.wage
FROM Employee E
WHERE E.eid = emp
}

Employee
eid   name    dept   wage
100   Aaron   1      55000
101   Baker   1      35000
102   Chase   1      37500
200   Davis   2      21250

(Parts (a) and (b) of Figure 2.1: the database function salary and the sample relation Employee.)

3 Creating Policies over RDOs

We have created a syntax for specifying security policies over RDOs; the syntax was designed to be easy
to use and to support a wide range of security policies. Ease of use again means abstraction (there are no
implementation details in the policy specification), but it also means keeping track of as few details about
the policy itself as necessary. Supporting a wide range of policies means that our policies are discretionary,
as mandatory policies place tight restrictions on access and modification based on information flow.

3.1 Syntax

A  policy,  in  our  syntax,  is  simply  a  list  of  statutes,  each  statute  placing  specific  restrictions  over  a  set  of 
RDOs  based  on  properties  of  the  RDO.  Since  we  are  interested  here  in  modification  policies,  we  will  only 
describe  the  syntax  of  statutes  on  modification.  Such  statutes  can  take  one  of  two  formats: 

Format 1: ALLOW MODIFY f(p) WHERE pr(p)

Format 2: DISALLOW MODIFY f(p) WHERE pr(p)

where f is a database function, p is a list of parameter names corresponding to the parameters of f, and
pr(p) is a boolean predicate over the parameters named in p. Informally, each statute specifies a list of
RDOs whose value the DBMS user is allowed (or not allowed) to modify. This includes every RDO of the
form (f, a), where f is the database function mentioned in the statute and a is a list of arguments to f that
cause pr(p) to evaluate to TRUE when parameter list p is bound to a. For example, the statute

ALLOW  MODIFY  salary (emp)  WHERE  emp  =  200 

allows a user to change the value associated with RDO (salary, (200)), which corresponds to the salary of
the  employee  with  ID  number  200.  Note  that  this  statute  makes  no  reference  to  any  property  of  the  person 
performing  the  query  (in  particular,  no  reference  to  a  user  clearance  level),  and  therefore  applies  to  any 
DBMS  user.  If  we  want  the  semantics  of  a  statute  to  depend  on  who  is  performing  the  modification,  we 
need  some  extra  syntax.  If,  for  example,  we  let  $EID  be  a  token  that  evaluates  to  the  employee  ID  number 
of  the  current  DBMS  user,  the  statute 

DISALLOW  MODIFY  salary (e)  WHERE  e  =  $EID 

prevents the user from modifying the value of RDO (salary, ($EID)), which corresponds to the user's own
salary.

The  previous  two  example  statutes  have  been  quite  simple:  the  predicate  compares  a  single  parameter  to 
a  constant  value.  More  complex  policies  will  be  not  merely  value-based  but  content-based,  dependent  on  the 
current  contents  of  the  database.  To  ensure  that  our  policies  remain  independent  of  the  database  schema 
(like  the  RDOs  over  which  our  policies  are  defined),  we  require  that  all  accesses  to  the  database  be  through 
database  function  calls.  For  example,  in  the  statute 

ALLOW  MODIFY  salary (emp)  WHERE  salary ($EID)  >  salary (emp) 
the function salary is called twice in the predicate. This statute allows the user to modify the salary of any
employee whose salary is smaller than their own. The statutes assume the existence of a current database
that supports all database functions used by a policy. Thus, if we make the database in Figure 2.1(b) the
current database, the preceding statute allows employee Aaron to modify every employee's salary but her
own.
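
As an illustration of a content-based predicate, the sketch below lists the arguments emp for which salary($EID) > salary(emp) holds against the sample database. It is an assumption-laden toy: the relation is a Python list, the function value is coerced to a single number, and the helper names `salary` and `covered` are not from the paper.

```python
EMPLOYEE = [(100, "Aaron", 1, 55000), (101, "Baker", 1, 35000),
            (102, "Chase", 1, 37500), (200, "Davis", 2, 21250)]

def salary(emp):
    """Value of salary(emp) against the current database, or None if the result is empty."""
    wages = [wage for (eid, _name, _dept, wage) in EMPLOYEE if eid == emp]
    return wages[0] if wages else None

def covered(current_user):
    """Arguments emp for which salary($EID) > salary(emp), with $EID bound to current_user."""
    own = salary(current_user)
    return [eid for (eid, *_rest) in EMPLOYEE
            if own is not None and own > salary(eid)]

print(covered(100))  # Aaron: [101, 102, 200]
print(covered(200))  # Davis: []
```

For Aaron (ID 100) the statute covers every salary RDO except her own, as stated above; for Davis (ID 200), who has the lowest wage, it covers none.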


3.2 Semantics

In  the  previous  section,  we  defined  the  semantics  of  statutes  in  terms  of  the  set  of  RDOs  that  satisfy  the 
predicate  of  the  statute.  Taking  our  first  statute  as  an  example,  the  only  value  we  can  assign  to  emp  to 
satisfy  the  predicate  emp  =  200  is  200 — therefore  the  only  RDO  we  allow  the  user  to  modify  under  this 
statute  is  (salary,  (200)).  But  the  abstractness  of  our  RDOs  presents  a  semantic  issue  we  don’t  face  with 
more  concrete  data  objects  like  relations  and  tuples:  their  contents  can  overlap.  This  means  that  when  we 
modify  the  value  of  one  RDO  we  may  also  be  modifying  the  value  of  other  RDOs  at  the  same  time.  For 
example,  consider  the  database  function  pay  defined  in  Figure  3.1(a)  below: 

pay(emp)
{
SELECT E.wage + D.bonus
FROM Employee E, Department D
WHERE E.eid = emp AND D.did = E.dept
}

(a)

Department
did   dname   bonus
1     Admin   1000
2     Sales   100
3     R&D     0

(b)

Figure 3.1: Example of potential overlap of RDOs: (a) new database function, (b) extension to
the database to support the new function

If we want to change the value of salary(emp, 𝒟) for a given emp and 𝒟, we must change the value of
the wage field of the corresponding tuple in Employee in 𝒟. For example, if 𝒟 is our sample database,
then changing the value of wage in the fourth tuple of Employee will change the value of (salary, (200))
accordingly. If we now add the relation Department from Figure 3.1(b) to 𝒟, the same modification to the
wage field changes the value of (pay, (200)) as well as (salary, (200)). Therefore the statute

ALLOW  MODIFY  salary (emp)  WHERE  emp  =  200 

must allow us to modify both of these RDOs, as one cannot change the value of (salary, (200)) without also
modifying the value of (pay, (200)). This raises a semantic issue with respect to policies: if a modification m
of the contents of a database 𝒟 changes the value of two distinct RDOs (f1, a1) and (f2, a2), when should we
allow m to take place: (a) when both RDOs satisfy an ALLOW MODIFY statute from the policy, or (b) when
at least one RDO satisfies an ALLOW MODIFY statute of the policy? We opt for the latter interpretation,
because it reduces the amount of information a policy writer needs to know. In particular, if a new database
function is added to the system, the policy writer need not worry about adding ALLOW MODIFY statutes over
the new function just to maintain the current policy.
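
The overlap between salary and pay RDOs can be observed directly. The sketch below is illustrative only: the relations are Python dictionaries keyed by eid and did, and the new wage of 22000 is a hypothetical value chosen for the example.

```python
employee = {100: {"dept": 1, "wage": 55000}, 200: {"dept": 2, "wage": 21250}}
department = {1: {"bonus": 1000}, 2: {"bonus": 100}}

def salary(emp):
    return employee[emp]["wage"]

def pay(emp):
    return employee[emp]["wage"] + department[employee[emp]["dept"]]["bonus"]

functions = {"salary": salary, "pay": pay}
rdos = [("salary", 200), ("pay", 200), ("salary", 100), ("pay", 100)]

before = {r: functions[r[0]](r[1]) for r in rdos}
employee[200]["wage"] = 22000      # a single field update (hypothetical new value)
after = {r: functions[r[0]](r[1]) for r in rdos}

changed = [r for r in rdos if before[r] != after[r]]
print(changed)  # [('salary', 200), ('pay', 200)]
```

Under interpretation (b), an ALLOW MODIFY statute covering (salary, (200)) is enough to permit this update even though no statute mentions pay.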

Another unusual semantic feature of RDOs is that one can change the contents of the database without
changing the value of any RDO. In fact, it is possible to change the value of tuple fields used to calculate
an RDO without changing the actual value calculated. For example, if we change the wage field of the last
tuple in Figure 2.1(b) from 21250 to 21300 and the bonus field of the second tuple in Figure 3.1(b) from 100
to 50, the value pay(200, 𝒟) is still 21350, even though both of these fields are used to calculate this value.*
From our standpoint, any change to the database that doesn't change the value of any RDO doesn't count
as a modification, and therefore will be allowed regardless of the current policy. To sum up, the set of RDOs
that an ALLOW (DISALLOW) MODIFY statute will allow (not allow) to be modified is:

• RDOs whose database function matches the statute function and whose argument list satisfies the
statute predicate

•  any  RDOs  whose  value  changes  as  a  result  of  modifying  said  RDOs  (because  of  overlap) 

Any  RDOs  whose  value  doesn’t  change  as  a  result  of  a  modification  to  the  database  have  no  effect  on  whether 
or  not  the  modification  is  allowed. 

As  a  policy  is  simply  a  list  of  statutes,  the  set  of  RDOs  one  can  modify  under  a  policy  is  based  on  the 
individual  statutes  of  the  policy.  An  RDO  r  may  be  modified  under  policy  P  iff 

• There is at least one ALLOW MODIFY statute s ∈ P that covers r

• There are no DISALLOW MODIFY statutes s ∈ P that cover r
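
The rule for a single RDO can be written down as a small sketch. Two simplifications are assumed here and are not from the paper: "covers" is reduced to a direct match on function name and predicate (overlap between RDOs is ignored), and the two constant-predicate statutes of Section 3.1 are combined into one policy purely for illustration.

```python
def covers(statute, rdo, user):
    """A statute covers an RDO if the function names match and the predicate holds."""
    _kind, function, predicate = statute
    name, args = rdo
    return function == name and predicate(*args, user)

def may_modify(rdo, policy, user):
    allow = any(s[0] == "ALLOW" and covers(s, rdo, user) for s in policy)
    deny = any(s[0] == "DISALLOW" and covers(s, rdo, user) for s in policy)
    return allow and not deny

# Assumption: both example statutes placed in one policy; `user` plays the role of $EID.
POLICY = [
    ("ALLOW", "salary", lambda emp, user: emp == 200),
    ("DISALLOW", "salary", lambda e, user: e == user),
]

print(may_modify(("salary", (200,)), POLICY, user=100))  # True
print(may_modify(("salary", (200,)), POLICY, user=200))  # False
print(may_modify(("salary", (101,)), POLICY, user=100))  # False
```

The third call fails because no ALLOW MODIFY statute covers the RDO, which shows that the absence of a prohibition is not sufficient.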

The syntax and semantics we've provided for specifying modification policies are flexible (because they are
discretionary and value-based) and abstract (because all references to data are made through database
function calls). But how would one enforce such a security policy? This is the subject of the next section.

4  Enforcing  Our  Policies 

In existing discretionary security policies, the units of protection are sufficiently concrete that enforcing
security is straightforward. System R, for example, protects tables and views, and one can simply maintain a
list for each user of the tables and views they are permitted to access. Because of the abstractness of RDOs,
one cannot simply maintain a list of which database objects the user is allowed to operate on: policies can
depend on the contents of the database, and database functions may be rewritten to keep up with changes
to database schemas. Essentially, we need to dynamically determine which RDOs have changed value when
a change is made to the database. For example, a change in the bonus field of the first tuple of Department
will change the value of three RDOs: (pay, (100)), (pay, (101)), and (pay, (102)).

*This assumes that both modifications occur within a single transaction.
This task is quite complex in general, so we have made several simplifying assumptions, which we will
explain as we go along. The following subsection gives a brief outline of our approach to solving this task;
the remaining subsections briefly explain the algorithms for achieving this solution. There will be a running
example of a policy and its enforcement to demonstrate this process.

4.1  Outline  of  Approach 

Given a set of database functions F, a database 𝒟 that supports F, a modification policy P whose statutes
use functions from F, and a modification m of the contents of 𝒟, our task is to determine whether m is
permitted under P. We divide this task as follows:

1. For each database function f that is covered by a statute in P, determine whether f belongs to a
subclass called conjunctive locator functions. If not, all statutes in P that cover f are ignored.

2. For each function f_i that belongs to this subclass, construct an internal table of all RDOs over f_i that
have nonempty values when evaluated against 𝒟. Store the value of the RDO with its identifying
information.

3. Perform modification m on 𝒟.

4. Re-evaluate every RDO r in the internal tables and compare the new value with the stored value. If
the two values are different, add r to a set R.

5. Compare every RDO r ∈ R to every statute s ∈ P to determine whether s covers r. If there is any
DISALLOW MODIFY statute in P that covers any r ∈ R, undo the modification. Otherwise, if there
is no ALLOW MODIFY statute in P that covers any r ∈ R, undo the modification. Otherwise let the
modification stand.

We now consider each of these steps in turn, with the following running example to demonstrate our
approach: let F contain the two database functions previously defined (salary and pay), 𝒟 contain the two
relations previously defined (Employee and Department), and P be the following policy:

ALLOW  MODIFY  salary (emp)  WHERE  emp  =  $EID 
DISALLOW  MODIFY  pay (emp)  WHERE  salary (emp)  <  30000 

4.2  Before  the  Modification 

Before  we  are  ready  to  enforce  our  policy,  we  must  ensure  that  we  are  in  the  proper  initial  state.  Basically, 
we  must  (a)  record  the  existence  of  every  RDO  whose  value  could  be  affected  by  the  modification  and  (b) 
record  the  current  value  of  these  RDOs  so  that  we  can  compare  this  value  to  the  value  after  the  modification. 
This  process  is  difficult  to  generalize,  therefore  we  have  restricted  our  attention  to  a  subclass  of  database 
functions  called  conjunctive  locator  functions,  for  which  we  have  defined  algorithms  that  are  provably  correct 
with  respect  to  our  intended  semantics.  This  subclass  is  defined  as  follows: 

Definition 5: A database function f is a conjunctive locator function iff:

• For any database 𝒟 that supports f and RDO (f, a), f(a, 𝒟) evaluates to either the empty
set or a singleton set

• For any list of tuples t that may be assigned to the tuple variables of f, there is no more
than one RDO (f, a) such that the WHERE clause of f is satisfied when the parameters of f
are bound to a and the tuple variables of f are bound to t

• The WHERE clause of f is a conjunction of equality tests among (a) function parameters, (b)
tuple variable attributes, and (c) literal values □

Both of the database functions in our running example are conjunctive locator functions. Functions with
these properties simplify the creation of internal tables, which are used to record the current value of all
RDOs with nonempty values.
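
Of the three conditions in Definition 5, only the third is purely syntactic. A rough sketch of that one check follows; it assumes the WHERE clause is available as a string, uses a deliberately simple regular expression for terms, and does not attempt the first two conditions, which depend on the data. The name `is_conjunction_of_equalities` is illustrative.

```python
import re

# A term is a tuple variable attribute (V.attr), a parameter name, or a literal.
TERM = r"(?:[A-Za-z_]\w*\.[A-Za-z_]\w*|[A-Za-z_]\w*|\d+|'[^']*')"
EQUALITY = re.compile(rf"^\s*{TERM}\s*=\s*{TERM}\s*$")

def is_conjunction_of_equalities(where):
    """True if the clause is a list of 'term = term' tests joined by AND."""
    conjuncts = re.split(r"\s+AND\s+", where.strip(), flags=re.IGNORECASE)
    return all(EQUALITY.match(c) is not None for c in conjuncts)

print(is_conjunction_of_equalities("E.eid = emp"))                     # True
print(is_conjunction_of_equalities("E.eid = emp AND D.did = E.dept"))  # True
print(is_conjunction_of_equalities("C.age < maxAge"))                  # False
```

The WHERE clauses of salary and pay pass, while the inequality used by youngerKids in Section 2 does not.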

Definition 6: Given a conjunctive locator function f and database 𝒟 that supports f, the
internal table for f over 𝒟 (denoted T(f, 𝒟)) is a relation with the following properties:

• The relation schema contains one attribute for each parameter of f and one value attribute.

• For every RDO (f, a) such that f(a, 𝒟) is nonempty, there is a tuple t in T(f, 𝒟) such that
(a) the attribute for a given parameter of f contains the value of the corresponding argument
in a and (b) the value attribute contains the value f(a, 𝒟) □

Figure 4.1 shows the internal tables for our two database functions salary and pay. Notice that there
is one tuple in each of these tables for every tuple in Employee. We give an algorithm for calculating the
contents of an internal table below; the restriction of our method to conjunctive locator functions is primarily
to simplify the creation of these internal tables.


Algorithm 1 (creating internal tables):

For every legal assignment of tuples in 𝒟 to tuple variables in f:

1. Substitute the values of the fields of the corresponding tuples for the tuple attribute
expressions in the body of f

2. If there is an assignment of values to the parameters of f such that the WHERE clause
evaluates to TRUE, add a tuple to T(f, 𝒟) assigning each parameter field the value bound
to that parameter and assigning the value field the value of the expression in the SELECT
clause of f

To see how this algorithm works, let f be salary and assign tuple variable E the first tuple of Employee.
Then E.eid is assigned the value 100, and we set parameter emp to 100 to make the WHERE clause evaluate
to TRUE. Since we were able to satisfy the clause, we add a tuple to T(salary, 𝒟) with an emp value of 100
and the value attribute set to E.wage, or 55000.
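
Algorithm 1 can be sketched for the two functions of the running example. The encoding below is an assumption made for illustration, not the paper's representation: a function is a dictionary with its parameters, its FROM clause, a WHERE clause given as a list of equality pairs, and a SELECT expression as a callable; in a WHERE term a tuple means V.attr, a string means a parameter name, and anything else is a literal.

```python
from itertools import product

DB = {
    "Employee": [{"eid": 100, "dept": 1, "wage": 55000}, {"eid": 101, "dept": 1, "wage": 35000},
                 {"eid": 102, "dept": 1, "wage": 37500}, {"eid": 200, "dept": 2, "wage": 21250}],
    "Department": [{"did": 1, "bonus": 1000}, {"did": 2, "bonus": 100}, {"did": 3, "bonus": 0}],
}

SALARY = {"params": ["emp"], "from": {"E": "Employee"},
          "where": [(("E", "eid"), "emp")],
          "select": lambda t: t["E"]["wage"]}
PAY = {"params": ["emp"], "from": {"E": "Employee", "D": "Department"},
       "where": [(("E", "eid"), "emp"), (("D", "did"), ("E", "dept"))],
       "select": lambda t: t["E"]["wage"] + t["D"]["bonus"]}

def term_value(term, tuples, binding):
    """Value of a WHERE term, or None for a parameter that is not yet bound."""
    if isinstance(term, tuple):          # tuple attribute expression V.attr
        var, attr = term
        return tuples[var][attr]
    if isinstance(term, str):            # parameter name
        return binding.get(term)
    return term                          # literal value

def internal_table(f, db):
    table = []
    variables = list(f["from"])
    # Enumerate every assignment of tuples to the tuple variables of f.
    for rows in product(*(db[f["from"][v]] for v in variables)):
        tuples = dict(zip(variables, rows))
        binding, satisfied = {}, True
        for lhs, rhs in f["where"]:
            left = term_value(lhs, tuples, binding)
            right = term_value(rhs, tuples, binding)
            if left is None and right is not None:
                binding[lhs] = right     # bind the parameter so the equality holds
            elif right is None and left is not None:
                binding[rhs] = left
            elif left != right:
                satisfied = False        # two known values disagree
                break
        if satisfied and all(p in binding for p in f["params"]):
            args = tuple(binding[p] for p in f["params"])
            table.append((args, f["select"](tuples)))
    return table

print(internal_table(SALARY, DB))
print(internal_table(PAY, DB))
```

Because every conjunct is an equality, each tuple assignment determines at most one parameter binding, which is why the restriction to conjunctive locator functions makes this enumeration straightforward.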


T(salary, 𝒟)         T(pay, 𝒟)

emp   value          emp   value
100   55000          100   56000
101   35000          101   36000
102   37500          102   38500
200   21250          200   21350

Figure 4.1: The internal tables generated by using Algorithm 1 against database 𝒟
4.3 Making the Modification

Once  the  internal  tables  have  been  properly  set  up,  we  can  modify  the  database.  Note  that  our  modification 
policy  says  nothing  about  how  to  modify  the  database — all  we  need  to  know  is  which  fields  were  modified 
and  what  their  new  values  are.  Views,  on  the  other  hand,  provide  a  fixed  interface  to  the  underlying  database 
that  may  not  map  uniquely  to  the  database.  For  our  running  example,  we  will  change  the  value  of  the  bonus 
field  in  the  second  tuple  of  Department  from  100  to  500. 

Having made the modification, we need to determine which of our RDOs has changed value. We do this
by taking each tuple t of each internal table T(f, 𝒟), extracting the argument list a from t, calculating the
value f(a, 𝒟), and comparing this value with the stored value attribute in t. If they are not equal, we add
the RDO (f, a) to set R, which is initially the empty set. We currently have two internal tables with four
tuples each.

Starting with the first tuple of T(salary, 𝒟), we extract the argument list from the tuple. The argument
list for this internal table is simply (emp), and therefore (100) for the first tuple. We then calculate
salary(100, 𝒟), where 𝒟 is the new modified database, and get 55000*, which is the same as before. We
therefore do not modify R. In fact, no RDO over salary will change value, as no modification was made to
Employee, which is the only relation used by the function. We therefore skip to the first tuple in the pay
internal table.

We evaluate pay(100, 𝒟) by assigning tuple variable E the first tuple in Employee and assigning D the
first tuple in Department, because that assignment satisfies the WHERE clause. Since we modified the second
tuple  of  Department,  not  the  first,  the  return  value  is  the  same  as  before,  namely  56000.  The  only  RDO 
that  changed  value  is  (pay,  (200)),  represented  by  the  last  tuple  in  the  pay  internal  table.  We  therefore  add 
this  RDO  to  our  (previously  empty)  set  R  and  quit  checking  values.  At  this  point,  we  know  of  every  RDO 
over  our  database  functions  that  has  changed  value  with  respect  to  the  current  database. 

We should note that this process of calculating which RDOs have changed value is very expensive, both
in terms of time and space. We have to create an internal table for every function we are interested in and add a
tuple to that table for every RDO that has a nonempty value using our current database, and that is before
we even start enforcing our policy. Once a modification is made, we need to recalculate the value of every
RDO, even though most will not change value. And each of these evaluations corresponds to a separate
query over our database, with a separate scan of all the relevant relations. We intend to develop a more
efficient algorithm for discovering which RDOs have been modified as our research progresses.

*To be precise, we get {55000}, as functions return sets as values. But because conjunctive locator functions will never
return a set of more than one element, we coerce the set into its single element.

4.4 Checking its Correctness

Once we have the set of RDOs whose value has been modified, we can compare them against the statutes of
our policy to see whether they allow (or disallow) the RDO to be modified. As with other positive/negative
permission mechanisms (e.g. GRANT/REVOKE in System R [ABC+76] and ALLOW/DISALLOW in
RRDS [GO90]), we allow modification when at least one positive permission (for us, an ALLOW MODIFY
statute) covers the RDO and no negative permissions (DISALLOW MODIFY statutes) cover the RDO. But in
our case RDOs can overlap, therefore we have to extend this a bit: a modification is permitted if any of the
RDOs in our set R are covered by an ALLOW MODIFY statute in our policy P and none of these RDOs are
covered by a DISALLOW MODIFY statute in P.

In  our  running  example,  the  set  of  changed  RDOs  R  has  a  single  element,  which  is  (pay,  (200)).  We 
see  that  the  first  statute  in  P  does  not  cover  this  RDO,  regardless  of  who  made  the  modification,  because 
this statute only applies to RDOs over function salary. The second statute is over the correct function,
therefore we test whether the predicate holds over this RDO, that is, whether salary(200) < 30000. Since
salary(200) = 21250 when evaluated against our current database, the predicate holds. In sum then, no
ALLOW MODIFY statute holds and one DISALLOW MODIFY statute holds. Therefore, the modification is not
allowed and must be undone.
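
The test described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the Statute class, the modification_permitted function, the first statute's predicate (emp = $EID), the acting user and the salary value for employee 100 are hypothetical. Only salary(200) = 21250 and the second statute's predicate are taken from the running example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple

# An RDO is modelled as (function name, argument tuple), e.g. ("pay", (200,)).
RDO = Tuple[str, Tuple[int, ...]]


@dataclass(frozen=True)
class Statute:
    allow: bool                     # True: ALLOW MODIFY, False: DISALLOW MODIFY
    function: str                   # database function named after MODIFY
    predicate: Callable[..., bool]  # WHERE clause, applied to the RDO's arguments

    def covers(self, rdo: RDO) -> bool:
        name, args = rdo
        return name == self.function and self.predicate(*args)


def modification_permitted(changed: Iterable[RDO], policy: Iterable[Statute]) -> bool:
    """Permit iff some changed RDO is covered by an ALLOW MODIFY statute
    and no changed RDO is covered by a DISALLOW MODIFY statute."""
    changed, policy = list(changed), list(policy)
    allowed = any(s.allow and s.covers(r) for s in policy for r in changed)
    denied = any(not s.allow and s.covers(r) for s in policy for r in changed)
    return allowed and not denied


# Hypothetical data: salary(200) = 21250 is from the text, the rest is assumed.
SALARY = {100: 40000, 200: 21250}
EID = 100  # the user making the modification (assumed)

policy = [
    Statute(True, "salary", lambda emp: emp == EID),         # assumed predicate
    Statute(False, "pay", lambda emp: SALARY[emp] < 30000),  # salary(emp) < 30000
]

print(modification_permitted([("pay", (200,))], policy))  # False
```

With R = {(pay, (200))} the sketch rejects the modification, since no ALLOW MODIFY statute covers the RDO and the DISALLOW MODIFY statute does.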

Since there were no changes to the database, the internal tables as they stand are correct, so we can go
directly to step 3 of our enforcement algorithm (from Section 4.1) and perform the next modification. Had
the modification been allowed (e.g. a modification of the wage field of the first Employee tuple by employee
Aaron), we would have to replace the value attribute for all RDOs in R to reflect the modification (in this
case, (salary, (100)) and (pay, (100))). Then we could proceed to step 3 as before.

5  A  Comparison  of  Views  and  RDOs 

We  are  now  in  a  position  to  compare  the  expressive  power  of  relational  views  and  RDOs  as  a  basis  for 
defining security policies. In this section, we will define an abstract policy (based on our existing relations
Employee  and  Department),  represent  the  abstract  policy  using  both  views  and  RDOs  as  basic  data  objects, 
and  discuss  how  one  would  enforce  such  a  policy  as  defined  for  both  of  our  data  abstractions. 

5.1  Policy  Representation 

First, let us define an abstract policy (where by abstract we mean independent of the underlying data
representation). Let our abstract policy P be defined as follows:

P: An employee may modify the pay of all employees who work in their department,
but no others.

In order to implement this policy, we will need abstract definitions of the concepts “pay” and “department”
of a given employee. For RDOs, we use the database function pay defined in Figure 3.1 to define the concept
“pay.” The equivalent definition using views is given in Figure 5.1 below.


* It should be noted that prefiles as described in [OvS92] take a different approach: all positive permissions (which they term
necessary conditions) must be satisfied.

** A statute covers an RDO when binding the parameters of the database function (which immediately follows the word
MODIFY in the statute) to the arguments in the RDO causes the predicate of the statute to evaluate to true.

CREATE VIEW PayView(id, pay) AS
SELECT E.eid, E.wage + D.bonus
FROM Employee E, Department D
WHERE E.dept = D.did

Figure  5.1:  A  representation  of  “pay”  using  a  view 

Note  that  we  have  two  attributes  in  our  view  PayView:  one  to  hold  the  pay  amount  and  one  to  identify  the 
employee whose pay it represents. We also need representations for the abstract concept of an employee’s
department; these are given (for views and RDOs) in Figure 5.2.


(a)
dept(emp)
{
SELECT E.dept
FROM Employee E
WHERE E.eid = emp
}

(b)
CREATE VIEW DeptView AS
SELECT eid, dept
FROM Employee

Figure  5.2:  Representations  of  an  employee’s  department  using  (a)  database  functions  and  (b)  views 
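
The definitions in Figures 5.1 and 5.2 can be tried out in a small sketch using Python's sqlite3 module. The sample rows, the column types and the Python wrappers pay and dept are our own assumptions for illustration. In particular, we assume that the database function pay of Figure 3.1 computes wage plus bonus for the given employee, mirroring PayView.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (did INTEGER PRIMARY KEY, bonus INTEGER);
CREATE TABLE Employee (eid INTEGER PRIMARY KEY, wage INTEGER, dept INTEGER);
-- Hypothetical sample data.
INSERT INTO Department VALUES (1, 1000), (2, 500);
INSERT INTO Employee VALUES (100, 30000, 1), (200, 20000, 2), (300, 25000, 1);

CREATE VIEW PayView(id, pay) AS
  SELECT E.eid, E.wage + D.bonus
  FROM Employee E, Department D
  WHERE E.dept = D.did;

CREATE VIEW DeptView AS
  SELECT eid, dept FROM Employee;
""")


def pay(emp):
    """Assumed form of database function pay(emp); returns a set of values."""
    rows = conn.execute(
        "SELECT E.wage + D.bonus FROM Employee E, Department D "
        "WHERE E.dept = D.did AND E.eid = ?", (emp,))
    return {r[0] for r in rows}


def dept(emp):
    """Database function dept(emp) from Figure 5.2(a); returns a set of values."""
    rows = conn.execute("SELECT E.dept FROM Employee E WHERE E.eid = ?", (emp,))
    return {r[0] for r in rows}


# The view carries the employee identifier as a column ...
print(conn.execute("SELECT pay FROM PayView WHERE id = 100").fetchone()[0])  # 31000
# ... whereas the database function takes it as an argument.
print(pay(100), dept(100))  # {31000} {1}
```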

Again, notice that the definition of DeptView includes a field to identify the relevant employee, whereas
the database function dept takes this identifier as an argument which it uses to calculate the appropriate
value. The difference between these approaches becomes clear when we use these definitions to represent our
abstract policy P, as shown in Figure 5.3.

Pview: ALLOW MODIFY P.pay
FROM PayView P, DeptView D1, DeptView D2
WHERE P.id = D1.eid AND D2.eid = $EID AND D1.dept = D2.dept

Prdo: ALLOW MODIFY pay(emp) WHERE dept(emp) = dept($EID)

Figure 5.3: Representations of abstract policy P using views and RDOs

As you can see, Prdo is not only simpler and shorter than Pview, but more abstract as well. In the view
policy, the attributes we created to store the employee ID number (id in PayView and eid in DeptView)
were mentioned by name. The RDO policy, on the other hand, only mentions the value of an employee
ID (through the database function's parameter emp) and not how that value is internally represented. The
implementation independence of Prdo makes it feasible to separate the duties of policy administration and
database administration, a feature we will expand on in the concluding remarks.

5.2  Policy  Enforcement 

For a policy specification to be of any use, we must be able to enforce the policy as specified. In the case
of abstract policy P, this means that we must allow all modifications m by a user u to a database D that
is covered by P where m does not change the pay of any employee e in a different department than u. For
example, we would allow u to modify the bonus field of the Department record corresponding to u but not
for any other Department record.

Notice  that  we  could  not  modify  the  pay  attribute  of  PayView  directly,  because  this  attribute  is  the  sum 
of  a  wage  field  and  a  bonus  field,  and  there  is  more  than  one  way  to  modify  these  two  fields  to  obtain  the 
new value of pay. Cases like this, where a change to a view maps nonuniquely to changes in the base relations
that populate the view, are what we call the view update problem (note that this is a different problem than
the one cited in §3.6.2 of [KS91], which concerns null key fields in base relation tuples created from a view
tuple). This problem requires us to restrict modifications to base relations only (e.g., Employee and Department in
the case of PayView). The algorithm for enforcing policy Pview against these changes is given in Figure 5.4
below:

1. Update PayView to reflect the changes to our base relations

2. Note all records in PayView whose pay attribute has changed

3. Assign each such record to tuple variable P from the WHERE clause of Pview

4. For each tuple assigned to P, determine if there are two tuples in DeptView that can be
assigned to tuple variables D1 and D2 such that the WHERE clause of Pview is satisfied

5. If there are any tuples bound to P that have no corresponding bindings to D1 and D2 that
satisfy the WHERE clause, the modification must be undone. Otherwise, the modification
stands.

Figure  5.4:  Algorithm  for  enforcing  Pview 
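
The steps of Figure 5.4 might be sketched as follows with Python's sqlite3 module. The function names make_db and enforce_pview, the sample rows and the use of a transaction to undo a rejected modification are our own assumptions, not part of the original design. SQLite recomputes the view on every query, so step 1 amounts to applying the modification to the base relations.

```python
import sqlite3


def make_db():
    # isolation_level=None: we issue BEGIN/COMMIT/ROLLBACK ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript("""
    CREATE TABLE Department (did INTEGER PRIMARY KEY, bonus INTEGER);
    CREATE TABLE Employee (eid INTEGER PRIMARY KEY, wage INTEGER, dept INTEGER);
    -- Hypothetical sample data.
    INSERT INTO Department VALUES (1, 1000), (2, 500);
    INSERT INTO Employee VALUES (100, 30000, 1), (200, 20000, 2), (300, 25000, 1);
    CREATE VIEW PayView(id, pay) AS
      SELECT E.eid, E.wage + D.bonus FROM Employee E, Department D
      WHERE E.dept = D.did;
    CREATE VIEW DeptView AS SELECT eid, dept FROM Employee;
    """)
    return conn


# WHERE clause of Pview, with P.id and $EID supplied as parameters.
BINDING_EXISTS = """
SELECT EXISTS (
  SELECT 1 FROM DeptView D1, DeptView D2
  WHERE D1.eid = :pid AND D2.eid = :eid AND D1.dept = D2.dept)
"""


def enforce_pview(conn, eid, sql, params=()):
    """Apply one base-relation modification on behalf of user eid;
    keep it only if every changed PayView record satisfies Pview."""
    before = dict(conn.execute("SELECT id, pay FROM PayView"))
    conn.execute("BEGIN")
    conn.execute(sql, params)                                   # step 1
    after = dict(conn.execute("SELECT id, pay FROM PayView"))
    # Step 2: records whose pay changed (deleted records are ignored here).
    changed = [i for i in after if after[i] != before.get(i)]
    # Steps 3-4: look for bindings of D1 and D2 for each changed record.
    ok = all(
        conn.execute(BINDING_EXISTS, {"pid": i, "eid": eid}).fetchone()[0]
        for i in changed)
    conn.execute("COMMIT" if ok else "ROLLBACK")                # step 5
    return ok


conn = make_db()
raise_bonus = "UPDATE Department SET bonus = bonus + 100 WHERE did = ?"
print(enforce_pview(conn, 100, raise_bonus, (1,)))  # True: own department
print(enforce_pview(conn, 100, raise_bonus, (2,)))  # False: undone
```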

The algorithm for enforcing Prdo, on the other hand, is given in Figure 5.5. Note that the algorithm
is fundamentally the same as in Figure 5.4 despite the surface differences: database function evaluation
replaces the relational scan of Pview, and variable binding replaces much of the equijoin computation.

1. Update the internal table for database function pay to reflect the changes to our base
relations

2. Retrieve the emp attribute for each tuple of T(pay, D) whose value attribute changed

3. Find all tuples in T(dept, D) with the same value in their emp field as was retrieved

4. Compare the value attribute of all such tuples to dept($EID). If any of our changed tuples
from the pay internal table has no corresponding tuple in the dept internal table with this
value as its value attribute, the modification must be undone. Otherwise, the modification
stands.

Figure 5.5: Algorithm for enforcing Prdo
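
For comparison, the steps of Figure 5.5 might be sketched in plain Python, with dictionaries standing in for the internal tables. The names pay_table, dept_table, enforce_prdo and raise_bonus, the dictionary encoding of the base relations and the sample data are all hypothetical. We again assume that pay(emp) is wage plus bonus.

```python
def pay_table(employee, department):
    """Internal table for pay: emp -> value, one entry per RDO with a value."""
    return {eid: wage + department[d]
            for eid, (wage, d) in employee.items() if d in department}


def dept_table(employee):
    """Internal table for dept: emp -> value."""
    return {eid: d for eid, (wage, d) in employee.items()}


def enforce_prdo(employee, department, eid, modify):
    """Apply modify to copies of the base relations on behalf of user eid.
    Returns (employee, department, accepted); a rejected modification
    leaves the original relations in place."""
    old_pay = pay_table(employee, department)
    new_emp, new_dep = modify(dict(employee), dict(department))
    new_pay = pay_table(new_emp, new_dep)                           # step 1
    changed = [e for e in new_pay if new_pay[e] != old_pay.get(e)]  # step 2
    depts = dept_table(new_emp)                                     # step 3
    user_dept = depts.get(eid)   # dept($EID): evaluated once, then constant
    accepted = all(user_dept is not None and depts[e] == user_dept
                   for e in changed)                                # step 4
    return (new_emp, new_dep, True) if accepted else (employee, department, False)


def raise_bonus(did, amount):
    """Example modification: raise the bonus of one department."""
    return lambda emp, dep: (emp, {**dep, did: dep[did] + amount})


# Hypothetical base relations: eid -> (wage, dept) and did -> bonus.
employee = {100: (30000, 1), 200: (20000, 2), 300: (25000, 1)}
department = {1: 1000, 2: 500}

print(enforce_prdo(employee, department, 100, raise_bonus(1, 100))[2])  # True
print(enforce_prdo(employee, department, 100, raise_bonus(2, 100))[2])  # False
```

Because dept($EID) is looked up once, each changed pay entry needs only a single lookup in the dept table, which mirrors the single-level scan discussed in the text.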

The algorithm for enforcing Prdo is simpler than the algorithm for enforcing Pview, because dept($EID)
remains constant as the former algorithm is evaluated. Therefore, each pay table record whose value has
changed will only require a single-level scan of the dept internal table, whereas we need a two-level scan of
DeptView to retrieve tuples D1 and D2 to satisfy Pview. This simplification, however, should not be seen
as a significant advantage, as a smart optimizer could conceivably make a similar improvement to Pview.
The primary advantage of using RDOs/database functions to specify an abstract policy versus views is the
implementation independence and simplicity of the former.

6 Conclusions


Nowadays,  we  expect  software  to  be  both  powerful  and  easy  to  use.  Word  processors,  for  example,  allow 
the  user  to  use  many  different  typefaces  and  sizes  of  text,  with  control  over  spacing,  margins,  alignment, 
and  other  aspects  of  the  document.  And  yet  these  packages  are  simple  to  learn  and  use,  because  the  user 
need  not  worry  about  how  information  concerning  typefaces,  font  size,  etc.  is  stored  in  their  document. 
That  is,  the  word  processor  provides  an  abstract  interface  (via  a  WYSIWYG  display  screen)  that  hides  the 
internal representation of the text from the user, yet provides her powerful tools for creating documents. This
abstract interface also allows the user to import documents created using other word processing packages, as
the interface the user sees is independent of how the formatting and textual information is represented. For
database  security  systems,  however,  this  expectation  is  not  easily  met. 

For  these  systems,  the  user  of  interest  is  the  person  who  creates  and  maintains  the  security  policy  for 
a specific computing environment. This policy writer would like to create policies like P from the previous
section, referring to high-level abstract concepts like “salary” and “department” without reference to
implementation details like field and relation names. Mandatory security is defined directly over concrete
relations, tuples, and fields; therefore, policies are implementation dependent. Furthermore, they are restricted
to information flow based policies. Views do provide abstraction in that they hide the underlying
data representation, but one must still know the schema for these views to write security policies using them,
as we saw with Pview. Furthermore, enforcing Pview requires functionality that existing view-based systems
don’t  possess.  DB2,  for  example,  will  not  allow  a  user  to  modify  the  contents  of  a  joined  view  like  PayView, 
which  is  different  than  preventing  the  contents  from  changing  value. 

On the other hand, Prdo only required that the policy writer knows how to call database functions pay
and dept. The implementation independence of RDO-based policies allows us to separate the duties of policy
management and database management. The database manager would manage the data and define database
functions over them; the policy writer would create sets of statutes using these functions. This leaves the
database manager free to change the database schema as the environment changes without worrying about
how it affects security*. Note that this separation of powers is the same form espoused by Clark and Wilson in
their seminal paper [CW87]. This work is also related to [Cab93], in that data in “physical” form must satisfy
constraints that are written over “abstract” data, but in her case, the physical-to-abstract transformation
must preserve the structure of the data (and functions like pay do not).

This advantage in flexibility and ease of use, however, comes with a heavy price tag. Enforcing a policy
is highly time- and space-consuming. Policies are currently limited to a relatively small class of database
functions**, because other classes are even more costly to enforce. All database access is through database
functions, therefore one loses the flexibility of SQL and may have to write hundreds of database functions,
which would substantially increase the cost of enforcing policies. Finally, like views themselves, verification
of code would be difficult, as the enforcement algorithm relies on much of the DBMS to function.

* See [Sjo92] for an example of a database whose schema changed dramatically in a relatively short period of time.

** The class is however sufficient, in that one can use conjunctive locator functions to retrieve any attribute of any tuple of
any relation with unique keys.

There  is  much  work  left  to  be  done.  We  need  to  find  more  efficient  algorithms  for  enforcing  security 
policies.  The  class  of  database  functions  we  can  and  should  monitor  needs  to  be  investigated.  There 
are many extensions we could make to the current policy syntax; several are worth pursuing. Finally, an
implementation of these concepts needs to be fleshed out to test the effectiveness of this approach. It is clear
that we have only laid the groundwork, and we have many questions to answer before this approach can be
deemed practical.

1. INTRODUCTION


Security  is  an  important  issue  when  dealing  with  computer  based  information  systems 
and  networks  (1,41,46,50).  Current  thinking  in  information  systems  security  is  that  the 
issues  center  on  confidentiality  (information  is  only  disclosed  to  those  users  who  are 
authorized  to  have  access  to  it),  integrity  (information  is  modified  only  by  those  users 
who  have  the  right  to  do  so),  and  availability  (information  and  other  IT  resources  can  be 
accessed by authorized users when needed). Computer based information systems are
vulnerable with respect to at least these issues. The level of security that should be
included in an information system involves, however, some judgement about the dangers
associated with the system and the resource implications of various means of avoiding or
minimising those dangers. Developments in this field have progressed today to a point
where information systems security needs to be tackled in a coherent and consistent
fashion as a subject in its own right (7,41,44,50).


Database  security  in  particular  is  an  area  of  substantial  interest  in  information  systems 
security today. In addition to the more common security concerns of integrity, access
control, audit etc., database systems add concerns for granularity, inference, aggregation,
filtering, journaling etc. (1,41,50). Database systems also provide new tools for enforcing
and controlling security. They also make it possible to increase granularity by enforcing
security at a record or even at a data item level. The security of a single element may
thus be different from the security of other elements of the same record or from values of
the same attribute. That is, the security of one element may be different from that of other
elements of the same database row or column. Database systems can also support several
grades of security for sets of data or individual data items. These grades may represent
ranges of allowable knowledge, which may overlap (2,41).


This paper presents a set of database security guidelines for the development of a secure
database system. The proposed guidelines help ensure the fulfilment of the corresponding
set of security principles, as defined in the high level security policy of the specific
establishment (47). Each such principle is implemented through one or more of the
proposed ten control agents. The guidelines also provide, to the related categories of
personnel, a functional guide for the introduction, administration and enforcement of the
appropriate level of database security. Effort has been made to avoid technical details
related to the methods and techniques used for the implementation of the guidelines,
which are discussed in detail in (41).


2. DATABASE SYSTEMS SECURITY

Database security is concerned with the ability of the system to enforce a security policy
governing the disclosure, modification or destruction of information. Within an
organisation humans typically use a database as a technical tool for storing, processing
and communicating information. At any time an amount of data has been stored in it, a
large amount of messages has already been sent, and the corresponding data can be
called for duplication and further transmission on demand from potential receivers. The
database relays the messages by persistently storing the corresponding data following
the three phase procedure (2): "accept messages ==> store / process data ==>
assemble, duplicate and communicate data on demand". The quality of mediation is
dependably assured by special protocols enforcing completion of transactions and
integrity constraints on stored data. Mediation is shared among many users and is
required to be efficient in time and space (2).


Databases are usually an amalgamation of data from many sources; users entrust their
data to a DBMS and rightfully expect protection of the data from unauthorised access,
loss or damage. Databases contain structured data that are maintained by a database
management system (DBMS), which is usually a separate software component that runs
on top of the operating system and provides the additional functions to use the
database. It may also include functions to manage transactions. A DBMS assumes one or
more data models upon which the data are structured (such as relations, networks,
hierarchies etc.). Database applications typically require a fine granularity of access
control. From a security viewpoint, database systems may be viewed as applications that
require considerable kernel service or as protected subsystems and/or trusted processes.
Databases may be considered to provide another level of security to complement that of
the operating system.

The following assumptions about the database system environment and general
principles related to database security have been widely accepted (10,11,14):

a. The database system security considerations must take into account all system
S/W and H/W that touches information flowing into, and out of, the database. For
example, an easily penetrated operating system would usually render a superbly
protected DBMS useless.

b. Data integrity is a key requirement. The database system must preserve the
integrity of the data stored in it. The user must be able to trust the system to give
back the same data that is put in the system and to permit data to be modified only
by authorised users. The data should not be destroyed or altered either accidentally,
as in a system crash, or maliciously, as in some unauthorised person modifying the
data. At the very least, the user should know if the data was corrupted.

c. Data should be available when needed. This implies system fault tolerance and
redundancy in data, software and hardware.

d. Audit should be detailed enough to be useful and efficient enough so as not to
severely burden system performance.

e. The aim should be to provide an adequate level of confidentiality (prevent
disclosure) and yet preserve integrity by using appropriate concurrency and integrity
controls (e.g. referential integrity).

f. The prototypes should be of general purpose, commercial quality and, according to
most proposals, relational systems. The relational system has been chosen because it
is (10) currently the model of preference in the commercial world.

Given the above definition and general framework of database security, we can regard a
database as a channel in the sense of communication theory. Then a database security
policy states (2) (i) which type of sub channels between (groups of) users can be
established, (ii) the requirements of the availability of certain facilities of the sub
channels, and (iii) the requirements on the (partial) separation and non-interference of
sub channels. Seen from this point of view, we can identify two prominent proposals for
database security policies (1,41,4?):

a. The mandatory security approach. The need for such a policy arises when a computer
database system contains information with a variety of classifications and has some
users who are not cleared for the highest classification of the information contained in
the system. The approach is frequently based on the following assumption (constructs):
there are users, data items and a lattice of security levels (1,41).

b. The discretionary security approach. Discretionary access controls in today's database
systems are designed to enforce a specific access control policy (1,41). The approach is
based on the following assumption (constructs): there are users, (well informed)
transactions, and (constraint) data items (1,2,41).

Each one offers a number of advantages. A basic distinction, for example, between the two
is in the degree of protection they provide from Trojan horse programs. In general,
discretionary security is set by the user and can be defeated by Trojan horse programs,
while mandatory security is set by the database system and is much more effective
against Trojan horse programs.

3. THE SEISMED PROJECT

The work reported in this paper has been based on research undertaken in the framework
of the SEISMED project of the European Union (EU). The SEISMED project (a Secure
Environment for Information Systems in MEDicine) is part of the European Union's AIM
(Advanced Informatics in Medicine) program. AIM is currently investing some 90 million
ECU to provide opportunities for improving computer information systems across Europe
within the medical environment. The project lasts for three years (1992-94) and is
implemented by a European consortium (41). It is the only one in the area of information
systems security and one of the biggest of the program.

The  main  objectives  of  SEISMED  are: 

To  develop  a  High  Level  Security  Policy  to  enable  organizations  using  information 
systems  to  follow  a  consistent  path; 

To develop specific guidelines to enhance security in existing systems, in the design
and  implementation  of  future  systems,  and  in  systems  using  networks; 

To  develop  an  encryption  prototype  for  use  in  health  care  environments; 

To  perform  risk  analyses  at  a  number  of  health  care  centers  across  Europe  to  identify 
the  opportunities  and  needs  for  improved  security; 

To  examine,  across  Europe,  the  legal  issues  of  data  protection  and  privacy  with  health 
care  information  systems  in  order  to  develop  a  common  deontology  or  code  of  ethics. 

To  identify  mechanisms  by  which  the  results  of  SEISMED  can  be  put  in  effective  use. 

The technical approach used in SEISMED was to break the project into a number of
inter-connecting themes, which are:

- The identification of current practices by means of a survey throughout Europe and
detailed risk analyses at four healthcare centres;

- The preparation of guidelines detailing:

- the development and implementation of a high level security policy

- how to perform risk analysis
-  how  to  include  security  with  a  system's  design 

-  how  to  retrospectively  include  security  within  existing  systems 

-  how  to  achieve  security  where  networks  are  utilised 

-  the  use  of  encryption  in  health  care  environments 

-  the  legal  framework  across  the  European  Union  countries. 

-  To  test  the  implementation  of  these  guidelines  (except  for  the  legal  framework)  by  the 
reference  centres  participating  in  the  project. 

-  To  revise  the  guidelines  in  light  of  the  reference  centre  experiences. 

4.  SECURE  DATABASE  DEVELOPMENT  METHODOLOGY 

The problem of developing a secure database system consists of three main issues
(3,4,10):

(i)  the  definition  of  the  semantics  of  the  secure  database  to  be  developed,  that  is  to 
characterise  the  needed  security  properties  in  terms  of  the  database  semantics, 

(ii)  the  implementation  of  those  semantics  on  a  database  system,  that  is  on  a  DBMS  and 
on  the  data  it  handles,  and 

(iii) assuring that the implemented system provides the needed security properties.

A development methodology has the purpose of specifying how each one of these three
secure database development issues can be achieved. This is usually accomplished by
guiding the various steps of the development, by providing modelling and analysis tools,
and by organising these three issues into a global framework that makes it possible to achieve
consistency of the whole development process and of the target system. Such a
development  methodology  should  be  multiphase  (29)  in  order  to  allow  for  incremental 
construction  of  the  secure  database  system.  The  various  phases  of  the  methodology 
should  also  be  supported  by  models  able  to  represent  the  details  necessary  to  each  phase 
and  to  support  the  types  of  analysis  about  the  system  under  design  which  pertain  to  each 
phase.  Multiphase  development  of  systems  is  a  widely-accepted  principle  today.  The 
benefits  of  a  multiphase  approach  lie  mainly  in  the  separation  between  design  and 
implementation  aspects;  this  is  the  principle  of  security  policies-mechanisms  separation 
(29).

[...] system development methodologies have been described in [...] based on the same
traditional generic ("V" [...]) model, which includes the following phases (49):
Requirements Analysis [...] Specifications (DES), Coding (COD), Testing (TES).

[...] database development could also be seen [...] that is generally carried out in a
number of similar steps (41): Database system security requirements analysis (REQ),
Prototyping (PRO), Database system design specifications (which include conceptual,
logical and physical design, DES), Coding (COD), Database system testing (TES), and
Database system verification (VER). Such a database development methodology has been
described in (41).

[...] database security should be seen as an integral part of the overall information
[...] considered in a unified way as early as possible in the information systems design
and implementation process. Prior to the definition and implementation of database
security, a well defined overall security policy and a suitable overall system design
methodology are therefore required. Such a detailed methodology and a set of guidelines
for the development of a suitable high level security [...] of the EEC [...]

If the above development methodology is to be used for the development of a security-[...]
the ones proposed in [...] included in the description of all these phases (5,6,41,47). The
guidelines help ensure that each of the database development phases will be performed
according to existing security standards and regulations. They also help to provide to the
[...] a functional guide for the introduction, administration and [...] security. The
database security guidelines have the following characteristics:

(i) Each guideline intends to fulfil at least one of the requirements for database security,

(ii) Each guideline addresses at least one personnel category, and

(iii) They are based on the requirements for database security and the personnel duties
and responsibilities, the potential and limitations of information systems and database
technology today, and the establishment's organisational structure and procedures.

5. THE CONTROL AGENTS

[...] provided for the fulfilment of a set of predefined (in the high level security
policy (HLSP)) database security principles (47,49). Each principle is implemented
through one or more control agents. The control agents [...] one or more of the basic
security principles (49). The proposed methodology incorporates ten such control agents,
namely: tools, review, documentation, formalism, traceability, standardization, code
reuse, methodology, responsibility and reliability. Their scope is described in the sequel.

1. Tools (TLS):

Software products should be employed by the database system designer to ensure that
[...] carried out according to efficiency or [...] programming [...]

2. Review (RVW):

All system development phases should be reviewed by teams of security experts together
with users and relevant technical experts, to ensure that all procedures have been
[...] according to security and integrity standards and criteria. Example: Test
[...] test procedures have been carried out.

nocumcntation  (DOC): 

Documentation  should  follow  the  end  of  every  development  phase  and  it  should  be  used 
as a reference to the security actions that should have been performed thus far. Example:
Documentation of the source code of all the database application programs can help in
comparing  the  programs  that  should  run,  with  the  ones  that  actually  do.  This  control 
agent  forms  the  main  input  to  the  verification  process  at  each  stage  as  well  as  to  the 
validation  and  certification  processes. 

4.  Formalism  (FRM): 

Specific  formal  procedures  should  be  followed  in  phases  where  the  complexity  of  the 
development  may  cause  misinterpretation  or  flaw  of  information,  or  other  relevant 
inconsistencies. Example: The conceptual database design should be described in a
formal  specification  framework. 

5. Traceability (TRC):

In several cases it should be possible to trace a procedure back to the requirement from
which it originated. Example: Every application program should be
associated with its functional requirement(s).

6.  Standardisation  (STD): 

Standards  help  the  database  system  developer  to  employ  a  uniform  set  of  approaches, 
thus  decreasing  the  types  of  techniques  that  are  available  for  malicious  software 
developers.  Example:  Coding  standards  in  database  applications  programming  increase 
the  auditability  and  maintainability  of  the  resulting  code. 

7. Code Reuse (CRU):

Code  reuse  provides  a  means  of  error  propagation;  therefore  it  should  be  done  according 
to  specific  security  procedures.  Example:  If  prototype  code  is  reused  in  the  development 
code,  then  it  should  be  first  tested  and  verified  towards  well  defined  security  goals. 

8.  Methodology  (MET): 

Specific  methodologies  should  be  used  to  ensure  that  a  procedure  has  been  performed  in 
the  (secure)  way  that  it  should  be.  Example:  All  test  tasks  should  be  performed  according 
to  a  methodology,  capable  of  ensuring  that  all  security  critical  test  data  have  been  used. 

9.  Responsibility  (RES): 

The  nomination  of  person(s)  or  organisation(s)  who  is(are)  responsible  for  the  fulfilment 
of  all  security  goals  is  an  effective  means  toward  preventing  security  violations. 
Example:  The  responsibility  for  the  verification  tasks  should  not  be  placed  with  the  body 
responsible  for  the  database  testing  tasks. 

10. Reliability (REL):

The  reliability  of  some  specific  procedures  should  be  measured  and  used  to  reduce 
unwanted  side-effects  to  an  acceptable  level.  Example:  Test  results  should  be  used  to 
reduce  observed  software  flaw  densities. 

6.  GUIDELINES  DESCRIPTION 

Now  we  present  a  structured  description  of  a  set  of  database  security  guidelines  which 
help  ensure  the  fulfilment  of  the  corresponding  set  of  information  system  security 
principles (as defined in the HLSP for the specific establishment (41)). Each such
principle is implemented through one or more of the ten control agents described earlier.
As discussed in (49), information systems developed according to the provisions set out
in the proposed guidelines enjoy security-related functions that are efficient and can be
integrated with all other specified functions. Given the scope of this paper, this list
cannot be exhaustive. Part of the security guidelines that we have proposed in the
framework of the SEISMED project of the EEC (41) is set out below in order to
demonstrate the process (41,47). We
have  tried  to  avoid  in  this  description  technical  details  related  to  the  implementation  of 
the  guidelines.  A  detailed  discussion  and  presentation  of  methods  and  procedures  for  the 
implementation  of  the  proposed  guidelines  can  be  found  in  (41). 

We  can  classify  database  security  guidelines  into  three  main  categories:  (i)  Database 
development  guidelines,  (ii)  Control  of  database  software  guidelines,  and  (iii)  Database 
operational  and  organisational  guidelines.  The  actual  description  of  each  guideline  below 
is  preceded  by  a  header  conforming  to  the  following  format: 

Guideline short title [Code Number / Phase / Control Agent]

In this format:

- "Phase":

The database development phases include the ones described earlier, and are referred
to as follows:

I. Database development security guidelines:

- Preliminary analysis security guidelines (PRE)
- Database system security requirements analysis (REQ)
- Prototyping (where applicable) (PRO)
- Database system design specifications (DES)
  (which include conceptual, logical and physical design)
- Coding (e.g. using DBMS tools) (COD)
- Database system testing (TES)
- Database system verification (VER)

II. Control of database software security guidelines:

- General precautions (GPR)
- Software development (SDV)

III. Operational and organisational security guidelines:

- Database organisational/administration guidelines (ORG)
- Database operational guidelines (OPER)

("A" in the development phase stands for "All phases").

- "Control agent":

It refers to the control agent codes described in the previous section:

1. Tools -> TLS
2. Review -> RVW
3. Documentation -> DOC
4. Formalism -> FRM
5. Traceability -> TRC
6. Standardisation -> STD
7. Code Reuse -> CRU
8. Methodology -> MET
9. Responsibility -> RES
10. Reliability -> REL

("OP" in the control target stands for "Overall Philosophy").
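The header format can be checked mechanically. The following is a minimal sketch in Python, not part of the proposed guidelines: the names `GuidelineHeader`, `parse_header` and the regular expression are our own assumptions, while the phase and control agent codes are the ones listed above.

```python
import re
from dataclasses import dataclass

# Phase and control agent codes as listed in the text ("A" = all phases,
# "OP" = overall philosophy).
PHASES = {"PRE", "REQ", "PRO", "DES", "COD", "TES", "VER",
          "GPR", "SDV", "ORG", "OPER", "A"}
AGENTS = {"TLS", "RVW", "DOC", "FRM", "TRC", "STD", "CRU",
          "MET", "RES", "REL", "OP"}

# Assumed textual shape: "<title> [<code>/<phase>/<agent>]".
HEADER_RE = re.compile(
    r"^(?P<title>.+?)\s*\[(?P<code>[^/\]]+)/(?P<phase>[^/\]]+)/(?P<agent>[^/\]]+)\]$"
)


@dataclass(frozen=True)
class GuidelineHeader:
    title: str
    code: str
    phase: str
    agent: str


def parse_header(line: str) -> GuidelineHeader:
    """Split a guideline header into its fields and validate the codes."""
    m = HEADER_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not a guideline header: {line!r}")
    phase = m["phase"].strip()
    agent = m["agent"].strip()
    if phase not in PHASES:
        raise ValueError(f"unknown phase code: {phase!r}")
    if agent not in AGENTS:
        raise ValueError(f"unknown control agent code: {agent!r}")
    return GuidelineHeader(m["title"].strip(), m["code"].strip(), phase, agent)
```

For instance, `parse_header("Risk analysis [DBG620-1-1/PRE/MET]")` yields the title, the code number `DBG620-1-1`, the phase `PRE` and the control agent `MET`.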

7.  DATABASE  SECURITY  GUIDELINES 
7.10 DATABASE DEVELOPMENT - 1

The set of the proposed security guidelines to be followed in all phases of the database
development process is the following:

1. Integrated Design Methodology [DBG610-1/A/OP]

Data protection procedures should be specified along with the information system's
development specifications.

2. Auditing [DBG610-2/A/OP]

A record of all controlled software development operations should be logged and
stored by the software development department in a protected repository.

3. Mediation [DBG610-3/A/OP]

All controlled development operations should be automatically mediated by the
software development department with respect to explicit mandatory and
discretionary security regulations.

4. Trusted Path [DBG610-4/A/OP]

An explicit mechanism should be included in the development department to ensure
that all controlled software development operations cannot be imitated or intercepted
by unauthorised means.

5. Identification - Authentication [DBG610-5/A/OP]

No developer should initiate or participate in any development operation unless they
have  first  been  identified  and  authenticated  by  the  software  development  department 
responsible  person. 

6. Configuration Management [DBG610-6/A/OP]

All  software  resulting  from  controlled  software  development  activity  should  be  stored 
in  a  protected  repository  that  maintains  and  controls  all  software  versions,  software 
modification  requests,  software  changes,  and  related  information  such  as  modification 
request  initiators  and  change  responsibility. 

7. Environment Integrity [DBG610-7/A/OP]

An explicit procedure should be available for identifying changes in all the
environment and tool-related software and, if required, for restoring the integrity
associated with that software.

8. Trusted Distribution [DBG610-8/A/OP]

All software should be delivered to the departments of the establishment in a manner
that ensures that its integrity has not been compromised.

9. Intrusion Detection [DBG610-9/A/OP]

Audit records should be used to perform periodic and random intrusion detection
analysis on the software development department.

10. Administration [DBG610-10/A/OP]

The  software  development  department,  as  well  as  the  tool  software  and  the  developed 
application  code  should  be  maintained  according  to  explicit  administrative 
documentation  by  experienced  administrators. 

11. Environment and Tools [DBG610-11/A/TLS]

All  tool  software  should  be  selected  according  to  an  explicit  selection  policy  that 
considers  the  maturity  and  development  course  of  the  software. 

12. Least Privilege [DBG610-12/A/OP]

Privileges to perform controlled software development operations should be allocated
and maintained so that a privilege is given only to those individuals among the
establishment's personnel who require that privilege.

13. Multi-person Control [DBG610-13/A/OP]

Controlled development operations should be completed with the active endorsement
and involvement of more than one experienced software developer.

14. Security Policy [DBG610-14/A/OP]

All  controlled  software  development  operations  should  be  performed  in  accordance 
with  an  explicitly  defined  and  enforced  security  policy. 

15. Shared Knowledge [DBG610-15/A/OP]

At  least  two  experienced  individuals  should  be  thoroughly  familiar  with  all  aspects  of 
the  requirements  and  design. 

16. Software Reuse [DBG610-16/A/CRU]

Any software reused during development should be selected according to an explicit
reuse policy that takes into account maturity and facilities for building object code
from reused source.

17. Planning [DBG610-17/A/OP]

The characteristics of all development activities should be described in a Software
Development Plan (SDP) and the management of the software development should
follow the approach described in this SDP.

18. Risk Mitigation [DBG610-18/A/OP]

All potential risks associated with the development approach should be explicitly
identified and a risk mitigation strategy should be documented and complied with
throughout the software development life cycle.
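Several of the guidelines above (Auditing, Least Privilege, Multi-person Control) can be combined into a single authorisation check. The sketch below is only an illustration under our own assumptions: the class `DevelopmentControl`, its method names and the threshold of two participants are hypothetical and are not prescribed by the guidelines.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set, Tuple


@dataclass
class DevelopmentControl:
    # Operation name -> developers holding the privilege (Least Privilege).
    grants: Dict[str, Set[str]] = field(default_factory=dict)
    # Minimum number of distinct participants (Multi-person Control);
    # the value 2 is an assumption for "more than one" developer.
    min_persons: int = 2
    # One entry per decision: (operation, initiator, allowed) (Auditing).
    audit_log: List[Tuple[str, str, bool]] = field(default_factory=list)

    def grant(self, developer: str, operation: str) -> None:
        """Give one developer the privilege for one operation only."""
        self.grants.setdefault(operation, set()).add(developer)

    def authorise(self, operation: str, initiator: str,
                  endorsers: Iterable[str]) -> bool:
        """Allow the operation only if every participant holds the privilege
        and enough distinct people are involved; log the decision either way."""
        holders = self.grants.get(operation, set())
        participants = {initiator, *endorsers}
        allowed = participants <= holders and len(participants) >= self.min_persons
        self.audit_log.append((operation, initiator, allowed))
        return allowed
```

In this sketch a refused request is logged as well, so that the audit record can later feed the intrusion detection analysis of guideline 9.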

7.20  DATABASE  DEVELOPMENT  -  2 

The  set  of  the  proposed  security  guidelines  for  each  specific  phase  of  the  design  and 
implementation of databases is the following:

i. Preliminary analysis phase:

1. Risk analysis [DBG620-1-1/PRE/MET]

Risk analysis of the target system should be performed prior to any database
development steps. Risk analysis should include assets identification, identification of
vulnerabilities of assets, estimation of the likelihood of exploitation, estimation of
expected costs, survey of new controls, savings estimation, etc.

ii. Database system requirements analysis phase:

1. Requirements Analysis Tools [DBG620-2-1/REQ/TLS]

Automated requirements analysis tools should be employed in order to provide
requirements specification, consistency checking, and documentation generation.

2. Requirements Analysis Review [DBG620-2-2/REQ/RVW]

Requirements  analysis  activities  should  be  performed  by  multiple  experienced 
development  personnel,  at  least  one  of  whom  is  not  directly  involved  in  requirements 
analysis.

3. Requirements Analysis Documentation [DBG620-2-3/REQ/DOC]

The characteristics of the requirements analysis process employed and a rationale for
all requirements should be documented.

4. Formal Requirements Specifications [DBG620-2-4/REQ/FRM]

Requirements should be specified in a formal specification framework.

5. Requirements Traceability [DBG620-2-5/REQ/TRC]

All requirements should be directly traceable to an explicit user source.

iii. Prototyping phase:

1. Prototyping Approach [DBG620-3-1/PRO/MET]

Prototyping should be performed according to an explicitly defined prototype plan that
describes the manner in which the prototype is designed, developed, tested,
documented, and protected.

2. Prototype Code Reuse [DBG620-3-2/PRO/CRU]

If prototype code is reused in the developed code, then the prototyping approach
should be taken into account in the measurement of software trust.


iv. Database system design specification phase (includes conceptual, logical and
physical design):

1. Design Tools [DBG620-4-1/DES/TLS]

Design tools should be employed to maintain design requirements traceability
mappings and to generate design documentation.

2. Design Review [DBG620-4-2/DES/RVW]

All design decisions should be reviewed by multiple experienced software
development personnel, at least one of whom is not directly involved in the design.

3. Design Documentation [DBG620-4-3/DES/DOC]

The characteristics of the design process, design alternatives considered, and design
rationales should be documented.

4. Formal Specification [DBG620-4-4/DES/FRM]

The design should be specified in a formal specification framework.

5. Design Traceability [DBG620-4-5/DES/TRC]

All aspects of the design should be demonstrated in all design documentation to be
traceable to an explicit set of requirements.

v. Coding phase:

1. Coding Standards [DBG620-5-1/COD/STD]

An explicitly defined coding standard that enforces modular, structured programming
should be complied with throughout the coding activity.

2. Code Analysis Tools [DBG620-5-2/COD/TLS]

All developed code should be subjected to code analysis using tools that measure
complexity, style, and potential programming violations.

3. Code Review [DBG620-5-3/COD/RVW]

All code should be reviewed by multiple experienced software development personnel,
at least one of whom is not directly involved in coding.

4. Code Documentation [DBG620-5-4/COD/DOC]

The characteristics of the coding process, the module organisation, criteria used for
module decomposition, and source code comments should be documented.

5.  Code  Traceability  [DBG620-5-5/COD/TRC] 

All  code  should  be  shown  in  the  source  code  documentation  to  be  directly  traceable  to 
an  aspect  of  the  design  and  to  a  set  of  requirements. 

vi.  Database  system  testing  phase: 

1.  Test  Strategies  [DBG620-6-1/TES/MET] 

All  test  and  integration  tasks  should  include  provision  for  security,  functional, 
penetration,  regression,  and  random  testing. 

2.  Test  Responsibility  [DBG620-6-2/TES/RES] 

The  responsibility  for  testing  should  be  placed  with  an  independent  body,  not  directly 
involved  with  coding  or  design. 

3. Reliability Measurement [DBG620-6-3/TES/REL]

Test  results  should  be  used  to  reduce  observed  software  flaw  densities  to  an  acceptable 
level. 

4. Testing Tools [DBG620-6-4/TES/TLS]

The software development department should include an automated testbed for
creating, executing, documenting, and analysing the completeness of all tests.

5. Test Review [DBG620-6-5/TES/RVW]

All  tests  should  be  reviewed  by  multiple  experienced  software  development  personnel, 
at  least  one  of  whom  is  not  directly  involved  in  testing. 

6.  Test  Documentation  [DBG620-6-6/TES/DOC] 

The  characteristics  of  the  test  process  should  be  documented. 

7. Test Traceability [DBG620-6-7/TES/TRC]

All  tests  should  be  shown  in  the  test  documentation  to  be  directly  traceable  to  explicit 
aspects  of  the  code,  design,  and  requirements. 

vii.  Database  system  verification  phase: 

1. Design Verification [DBG620-7-1/VER/FRM]

A formal verification should be performed to prove that the formal design
specification correctly meets its requirements.

2. Code Verification [DBG620-7-2/VER/FRM]

A formal verification should be performed to prove that a low-level formal
specification of the code correctly meets its requirements.

3.  Verification  Responsibility  [DBG620-7-3/VER/RES] 

The  responsibility  for  verification  should  not  be  placed  with  the  organisation 
responsible  for  testing. 

4. Verification Tools [DBG620-7-4/VER/TLS]

Design  and  code  verifications  should  be  performed  with  the  assistance  of  an 
automated  verification  environment. 

5. Verification Review [DBG620-7-5/VER/RVW]

All verification results should be reviewed by multiple experienced software
development personnel, at least one of whom is not directly involved in verification.

6. Verification Documentation [DBG620-7-6/VER/DOC]

The  design  and  code  verification  processes  employed  and  all  assumptions  required  to 
interpret  the  verification  results  should  be  documented  explicitly. 

7. Verification Condition Traceability [DBG620-7-7/VER/TRC]

All  verification  conditions  should  be  shown  in  the  verification  documentation  to  be 
traceable  to  explicit  aspects  of  the  requirements,  design  and  code. 
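The traceability guidelines of the individual phases (requirements, design, code, test and verification) share one structure: every item of a phase must be linked to at least one known item of the preceding phase. A minimal sketch of such a check follows; the function name `untraced` and the identifiers in the usage note are hypothetical, and a real project would take the links from its design and documentation tools.

```python
from typing import Dict, List, Set


def untraced(items: Set[str], trace: Dict[str, Set[str]],
             targets: Set[str]) -> List[str]:
    """Return, in sorted order, the items that have no trace link to any
    known target (for example design elements with no requirement)."""
    return sorted(i for i in items if not (trace.get(i, set()) & targets))
```

Applied to a design with elements `D1` and `D2`, where only `D1` is linked to requirement `R1`, the call `untraced({"D1", "D2"}, {"D1": {"R1"}}, {"R1"})` reports `D2` as the element that still has to be justified or removed.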

7.30  CONTROL  OF  DATABASE  SOFTWARE 

i.  General  precautions 

1. Demarcation [DBG630-1-1/GPR/MET]

A strict demarcation must be maintained between the operational versions of
database software and programs and their test equivalent. These should be kept in
libraries. The operational libraries must exist in both source and object
form. The operational form library should be used for the live operations and be
called up as necessary by the users for on-line or batch operations. The operational
source library should exist only as a means of support for the object library.
Database programmers and other development staff should never normally access
the operational library.

2. New systems [DBG630-1-2/GPR/MET]

New systems and changes to live programs will be lodged on the test source library
(DBG630-1-1). Only when the tests on the new version are completed will it be
transferred from the test to the operational libraries. This must always be done
through a formal procedure.

3. Operational versions [DBG630-1-3/GPR/MET]

Copies of the operational version of the source code of each operational database
program should be kept until any new version has been established as working
correctly.

4. Back up [DBG630-1-4/GPR/MET]

Back up of database software is as important as the back up of the database data.
Copies of database software should include both operational and test libraries.

5. Change log [DBG630-1-5/GPR/MET]

A log should be kept of all changes to the operational software libraries. Usually this
will consist of all the formal authorities to transfer, kept in a file. Sufficient details
of why any changes were made and the date of transfer should be recorded.
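The general precautions above describe a small workflow: changes are lodged in the test library, transferred to the operational library only after testing and with formal authority, the previous operational version is retained, and every transfer is logged. The sketch below illustrates this under our own assumptions; `SoftwareLibraries`, `ChangeRecord` and all method names are hypothetical, and an in-memory dictionary stands in for a real software library.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Set


@dataclass
class ChangeRecord:
    program: str
    authorised_by: str      # the formal authority to transfer
    reason: str             # why the change was made
    transferred_at: datetime


@dataclass
class SoftwareLibraries:
    test: Dict[str, str] = field(default_factory=dict)         # program -> source under test
    operational: Dict[str, str] = field(default_factory=dict)  # program -> live source
    previous: Dict[str, str] = field(default_factory=dict)     # retained earlier versions
    tests_passed: Set[str] = field(default_factory=set)
    change_log: List[ChangeRecord] = field(default_factory=list)

    def lodge(self, program: str, source: str) -> None:
        """New systems and changes go to the test library first."""
        self.test[program] = source
        self.tests_passed.discard(program)  # a new change must be retested

    def record_tests_passed(self, program: str) -> None:
        if program not in self.test:
            raise KeyError(program)
        self.tests_passed.add(program)

    def promote(self, program: str, authorised_by: str, reason: str) -> ChangeRecord:
        """Transfer from test to operational only after testing and with authority."""
        if program not in self.tests_passed:
            raise PermissionError("tests on the new version are not completed")
        if not authorised_by or not reason:
            raise PermissionError("formal authority and reason are required")
        if program in self.operational:
            self.previous[program] = self.operational[program]
        self.operational[program] = self.test[program]
        self.tests_passed.discard(program)
        record = ChangeRecord(program, authorised_by, reason,
                              datetime.now(timezone.utc))
        self.change_log.append(record)
        return record
```

Keeping the transfer as the only way to write to the operational library is what gives the demarcation its force in this sketch; back up of both libraries (guideline 4) is outside its scope.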

ii. Software development

1. DB software specifications [DBG630-2-1/SDV/FRM]

All database software written in-house should be written from a full specification.
When the specification is first drawn up it should be agreed by the database users
concerned, or their representative, and signed as correct. If there is any financial
impact the database auditor should also agree the specification and add any controls
that are necessary from the audit point of view. Checks should be made during
development on compliance with the full specification and the establishment's
security policy.

2. DB software testing [DBG630-2-2/SDV/MET]

All database software written in-house should be thoroughly tested. The testing may
consist of several phases, for example: the programmer testing that the software
does what is expected, the database analyst testing that the performance matches his
own expectations, the supervision by another programming specialist to ensure that
all paths in the software have been tested, the checking of the results by the end
users, the checking of the results by the database auditor (if anything financial is
involved), etc.

3. DB software maintenance [DBG630-2-3/SDV/MET]

As much care should be taken over maintenance of database software as for its
development. The same standards of testing, checking, documentation and sign-off
should apply to maintenance as to pure development.

From the security point of view we may think of three classes of maintenance,
with the corresponding security guidelines: corrective, adaptive and enhancing.
Corrective maintenance is required when the software is found to be not in keeping
with the specification or the real needs of the database users, or when an error has
been discovered. Adaptive maintenance is required when database software needs to
be adapted for changed circumstances while basically performing the same function.
Finally, enhancing maintenance is required when new uses or better services are
desired and the database software is adapted to provide them.

4. Corrective maintenance [DBG630-2-4/SDV/STD]

It is important that corrective maintenance takes place and errors are corrected as
soon as possible. There is no need for end users or the database auditor to be
involved, as the maintenance is only required to provide what should have been
there in the first place. However, thorough testing is required and the results
should be tested by someone other than the programmer changing the software.
A log must be kept of the change and the date and time that it was effective in case
of errors discovered later.

5. Adaptive and enhancing maintenance [DBG630-2-5/SDV/STD]

When  maintenance  is  required  for  those  reasons,  more  formal  procedures  should  be 
followed.  For  example,  the  originator  of  the  maintenance  (e.g.  the  database  user) 
should  be  required  to  complete  an  amendment  request,  which  must  be  agreed  by  the 
database  auditor. 

6.  Copies  [DBG630-2-6/SDV/MET] 

Measures  must  be  taken  so  that  where  multiple  copies  of  database  software  are  in 
use,  they  remain  consistent.  End  users  should  only  be  permitted  to  maintain  their 
own  applications  software  after  specific  agreement  with  the  database  administrator, 
and  only  if  they  hold  the  single  instance  of  this  software  and  they  are  able  to  take 
responsibility  for  any  changes. 
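The different treatment of the three maintenance classes can be summarised as a simple rule check. The following sketch is illustrative only: `MaintenanceRequest` and `violations` are hypothetical names, and applying the independent testing and logging requirements of corrective maintenance to the other two classes is our assumption, based on the statement that the same standards apply to all maintenance.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Maintenance(Enum):
    CORRECTIVE = "corrective"
    ADAPTIVE = "adaptive"
    ENHANCING = "enhancing"


@dataclass
class MaintenanceRequest:
    kind: Maintenance
    programmer: str
    tester: str
    change_logged: bool
    amendment_request_by: Optional[str] = None  # originator, e.g. a database user
    auditor_agreed: bool = False


def violations(req: MaintenanceRequest) -> List[str]:
    """List the guideline requirements that the request does not yet meet."""
    problems = []
    if req.tester == req.programmer:
        problems.append("results must be tested by someone other than the programmer")
    if not req.change_logged:
        problems.append("the change must be logged with its date and time")
    # Adaptive and enhancing maintenance follow the more formal procedure.
    if req.kind is not Maintenance.CORRECTIVE:
        if req.amendment_request_by is None:
            problems.append("an amendment request from the originator is required")
        if not req.auditor_agreed:
            problems.append("the database auditor must agree the amendment request")
    return problems
```

A corrective fix tested by a second person and logged returns an empty list, whereas an enhancement without an amendment request is reported before any code is changed.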

7.40  DATABASE  ORGANISATION  AND  OPERATION 
i.  Database  organisational  and  administration  guidelines 

1.  Scope  [DBG640-1-1/ORG/OP] 

Database  system  security  considerations  should  take  into  account  all  system  software 
and  hardware  that  touches  information  flowing  into,  and  out  of,  the  database. 
Example: an easily penetrated operating system would render a superbly protected
database  management  system  useless. 

2. Database security policy [DBG640-1-2/ORG/MET]

A well defined database security policy should be established before the database
goes into operation. The database security policy should provide adequate guidance on
issues like: database security policy administration, database security policy
specification, database access control specification, database information flow control,
enforcement of control, etc.

3.  Database  administration  [DBG640-1-3/ORG/OP] 

The  position  of  a  database  administrator  should  be  filled  prior  to  the  implementation 
of  the  database.  Appropriate  decisions  on  his  selection  criteria,  duties  and 
responsibilities  should  be  taken  in  the  early  stages  of  database  development. 

4. Access control [DBG640-1-4/ORG/OP]

A detailed access control policy should be specified, so that database users are allowed
to access only authorised data and so that different users can be restricted to different
modes of access. The database access policy should include: database access control
administration, database access control specification, database access control
enforcement, etc.

5. Integrity [DBG640-1-5/ORG/OP]

Data  integrity  is  a  key  requirement.  The  integrity  of  the  database  should  be  maintained 
so that the database data are immune to physical problems, the structure of the
database is preserved and the data contained in each element is accurate. The user
must be able to trust the system to give back the same data that is put in the system
and to permit data to be modified only by authorised users. At the very least, the user
should know if the data was corrupted.

6. Confidentiality [DBG640-1-6/ORG/OP]

The  database  system  should  provide  an  adequate  level  of  confidentiality/secrecy 
(prevent  unauthorised  disclosure)  and  yet  preserve  availability  and  integrity  by  using 
appropriate  controls  (e.g.  referential  integrity). 

7. Availability [DBG640-1-7/ORG/OP]

Database users should be able to access all the database data for which they are
authorised. This implies system fault tolerance and redundancy in data, software and
hardware. Inference and aggregation must be studied and controlled. A specific
availability policy should be developed which will deal with problems like arbitrating
two users' requests for the same data item, withholding some non-protected data in
order to avoid revealing protected data, etc.

8. People [DBG640-1-8/ORG/OP]

People are crucial in security. Trusted individuals, such as operators and programmers,
must be carefully selected because of their potential ability to affect the computer
(database) system and all computer users. Well defined procedures and rules for
employee selection must be developed and used.

9. Education and awareness [DBG640-1-9/ORG/OP]

Suitable training programs and seminars must be planned for all types of database
users. Education and awareness programs must take place on a periodic basis. There
should be a different program for every major type of user (e.g. technical users,
administration users, etc.).
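The integrity guideline (5) above asks that, at the very least, the user should know if the data was corrupted. One common way to meet this minimum is to store a digest with each record and recompute it on retrieval. The sketch below uses SHA-256 from the Python standard library; the choice of a hash function, and the names `seal` and `is_intact`, are our assumptions and are not prescribed by the guidelines.

```python
import hashlib
import hmac


def seal(record: bytes) -> str:
    """Compute a digest to be stored alongside the record."""
    return hashlib.sha256(record).hexdigest()


def is_intact(record: bytes, digest: str) -> bool:
    """Recompute the digest on retrieval and compare it with the stored one."""
    return hmac.compare_digest(hashlib.sha256(record).hexdigest(), digest)
```

A plain digest of this kind reveals accidental corruption only. Detecting deliberate modification by someone who can also rewrite the stored digest would require a keyed construction and the access controls of guideline 4.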


ii. Database operational guidelines

1. User authentication [DBG640-2-1/OPER/OP]

The DBMS should be designed to perform its own authentication at the required level,
in addition to the authentication performed by the operating system. Specific
authentication procedures should be implemented according to the specific needs and
security requirements of the particular establishment.

2. Auditability [DBG640-2-2/OPER/OP]

Appropriate audit procedures should be developed so that it will be possible to track
unauthorised accesses or modifications of the elements of the database. Audit should
be detailed enough to be useful, but not so detailed as to severely burden
system performance.

3. Inference control [DBG640-2-3/OPER/OP]

An inference prevention policy should be specified. This should include issues like:
database inference prevention administration, database inference prevention
specification, database inference prevention control, etc.

4. Database recovery [DBG640-2-4/OPER/OP]

A detailed database recovery policy should be specified. This policy should define:
database recovery prevention procedures, database recovery administration, database
recovery specification, database recovery control, etc.